  • Created 06 Dec 2013

Nano WG February 26, 2015

  1. Mervi Heiskanen

    Curator responsibilities: Lead author Karmann Mills. Additional contributors: Mark Hoover, Stephanie Morris, Egon Willighagen, Marty Fritts, Jennifer Jones.

    Covers curation responsibilities, including established and developing roles and the division of curation labor, and explores the real challenges associated with quantity vs. quality of data entries.

    Curation training and performance expectations will also be addressed, as will the roles of non-curators in defining the curation process (e.g. how might data “customers”, such as peer-reviewed journals and professional societies, influence the process?).

    Tools for autocuration – how do they impact curator responsibilities, how do they define the curator's role, where does human involvement occur, and what is the overall impact of automation (benefits and challenges/pitfalls)?

    NIST example (Marty)

    LIMS interactions with curator?

    Natural language – Sharon Gaheen example

    Text mining – Kaizhi Tang example

    Experience of authors as examples – fill in with stakeholder response

    Supervised approach – to avoid pitfalls
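
    One way to picture a supervised approach is a triage step in which automatically extracted values are accepted only above a confidence cut-off and everything else is routed to a human curator. The sketch below is illustrative only: the `ExtractedField` type, the `confidence` score, and the 0.9 threshold are all assumptions, not features of any tool named in these notes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class ExtractedField:
    """One value pulled from a source by a (hypothetical) autocuration tool."""
    name: str
    value: str
    confidence: float  # assumed score in the range 0.0 to 1.0


def triage(
    fields: Iterable[ExtractedField], threshold: float = 0.9
) -> Tuple[List[ExtractedField], List[ExtractedField]]:
    """Split fields into (auto-accepted, needs human review)."""
    accepted: List[ExtractedField] = []
    needs_review: List[ExtractedField] = []
    for field in fields:
        if field.confidence >= threshold:
            accepted.append(field)
        else:
            needs_review.append(field)
    return accepted, needs_review


# Made-up example values, for illustration only.
fields = [
    ExtractedField("core_material", "TiO2", 0.97),
    ExtractedField("primary_particle_size_nm", "25", 0.62),
]
accepted, needs_review = triage(fields)
```

    In this arrangement the curator's effort is concentrated on the low-confidence fields, which is one place where human involvement could sit in an otherwise automated pipeline.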

    Figure – customers, creators, curators, analysts, users – interactions of each type (add main responsibilities) with curators – ‘curator habitat’; non-static process; feedback

     

    Data integration: Lead authors Sharon Gaheen and Egon Willighagen. Additional contributors: Marty Fritts, Christine Hendren, Dennis Thomas, Stacey Harper, Mark Hoover, Richard Marchese-Robinson, Karmann Mills, John Rumble.

    How do we define and operationalize integration between databases and datasets?  What level of interoperability is required to support data integration in a way that supports various goals for comparison and analysis?

    Specific challenges to interoperability will be discussed, for example: what is the primary key – the root or kernel that makes an individual record unique? Some infrastructures base the primary key on the nanomaterial, whether at the batch level, the lot level, or just the product name. Others use a particular study or experiment as the basis around which the structure is oriented.

    This definition of a unique entry is fundamental to the structure of a database, often differs between resources, and greatly impacts how data are curated from a source.

    Finding ways to map across these differences in record definition will be an important consideration.
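
    As a rough illustration of mapping across record definitions, the sketch below assumes one resource keyed on the material (product name plus lot) and another keyed on the study, joined by a curated crosswalk. All class names, identifiers, and values are hypothetical and do not describe any existing database.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MaterialKey:
    """Primary key for a resource organized around the nanomaterial."""
    product_name: str
    lot: str


@dataclass(frozen=True)
class StudyKey:
    """Primary key for a resource organized around a study or experiment."""
    study_id: str


# Hypothetical curated crosswalk: which material lots each study used.
CROSSWALK: Dict[StudyKey, List[MaterialKey]] = {
    StudyKey("STUDY-001"): [MaterialKey("ExampleTiO2", "LOT-A")],
    StudyKey("STUDY-002"): [
        MaterialKey("ExampleTiO2", "LOT-A"),
        MaterialKey("ExampleTiO2", "LOT-B"),
    ],
}


def materials_for_study(study: StudyKey) -> List[MaterialKey]:
    """Study-centric lookup: material lots recorded for one study."""
    return list(CROSSWALK.get(study, []))


def studies_for_material(material: MaterialKey) -> List[StudyKey]:
    """Material-centric lookup: invert the crosswalk for one lot."""
    matches = [s for s, mats in CROSSWALK.items() if material in mats]
    return sorted(matches, key=lambda s: s.study_id)
```

    The crosswalk itself is curated content: someone has to decide that the lot named in one resource is the same physical material referenced by a study in another, which is where the differences in record definition become a curation task.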

    Model – Global Alliance for Genomics and Health; standard services desired by community (stakeholders/authors)

    Figure -

     

    Metadata: Lead authors Yoram Cohen and Christine Hendren. Additional contributors: Fred Klaessig, Stacey Harper, Sandra Karcher, Katrina Varner, Karmann Mills, Marty Fritts.

     

    Characteristics of metadata – consistency of attributes across fields

    Hierarchy of metadata?  Critical elements to capture?  Specific for particular goal?

    The way metadata are handled within a database and within data records is critical to every other nanotechnology data curation topic listed. 

    For example, environmental and biological media characterizations are critical for interpretation as well as comparison of data. 

    Temporal metadata are also key – instances of characterization.

    Environmental inclusion.

    How experimental and characterization timing is incorporated into data collection and infrastructure is integral to enabling reproducibility of data and to achieving functional interoperability between datasets.

    Size (HDD, primary particle size, NTA, shape factors) – intrinsic vs. extrinsic factors (conditional behaviors)

    Importance of how the data were obtained – instrumentation, process (metadata within metadata) – protocols needed for interpretation

    Sonication example – timing, duration, energy input
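
    The sonication example can be pictured as metadata nested inside the metadata of a size measurement. The sketch below is a minimal illustration, assuming constant delivered power so that energy input is simply power multiplied by duration; the field names, units, and values are hypothetical and not taken from any existing schema.

```python
from dataclasses import dataclass


@dataclass
class SonicationStep:
    """Sample-preparation metadata (timing, duration, energy input)."""
    minutes_before_measurement: float  # timing relative to the measurement
    duration_s: float                  # how long the sample was sonicated
    delivered_power_w: float           # assumed constant during the step

    @property
    def energy_input_j(self) -> float:
        """Energy in joules, assuming constant power: E = P * t."""
        return self.delivered_power_w * self.duration_s


@dataclass
class SizeMeasurement:
    """A size value together with the metadata needed to interpret it."""
    value_nm: float
    method: str        # e.g. the sizing technique used
    medium: str        # environmental or biological medium
    sonication: SonicationStep  # metadata within metadata


# Made-up example values, for illustration only.
step = SonicationStep(
    minutes_before_measurement=10.0, duration_s=120.0, delivered_power_w=5.0
)
measurement = SizeMeasurement(
    value_nm=85.0, method="NTA", medium="deionized water", sonication=step
)
```

    Keeping the preparation step attached to the measurement, rather than in a free-text notes field, is one way to make two size values comparable only when their preparation histories are known.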

    Context of study - Relevance of concentration (or exposure scenario) to the material being studied and its potential uses – assist in translation to risk assessment ultimately

    Dependency on the question you are asking, common themes?

    Independent - Uncertainty, protocols, best practices, controls

    Figure – effective metadata use to ensure proper use of data, focus on uses of metadata, relevant reliable metadata (focus), orients the user of the relevance/applicability of data
