Frequently Asked Questions
Where does the data come from?
- Grants – EGRP grants data come from the NIH IMPAC-II database. Active grants are those that are currently using NIH funds to conduct research. Inactive grants are those that are no longer conducting research using NIH funds.
- Publications – Publications data that are linked to EGRP grants are supplemented by additional data from PubMed. Publications from DCEG come from the DCEG Publications database. Staff publications come from EGRP staff members. Cancer genomic epidemiology literature comes from the Cancer HuGE Literature Finder, a database that is part of the HuGE Navigator.
- Genomic Tests – Genomic test data come from the Cancer Genomic Test Finder, a database that is part of the HuGE Navigator.
- Genomic Evidence – Genomic evidence data come from the Cancer Evidence Aggregator, a database that is part of the HuGE Navigator.
- Cancer GAMA – Summary genetic risk association data come from Cancer GAMAdb, a database that is part of the HuGE Navigator.
- dbGaP – The dbGaP link identifies EGRP grants that have deposited genomic data into the NIH Genotype and Phenotype Database.
How can CGEN benefit me?
CGEN offers many features to help you find information related to cancer epidemiology. Here are some suggestions:
- Review the EGRP Active Grants to see what the Epidemiology and Genomics Research Program (EGRP) is currently funding and to identify gaps in the NCI-supported grant portfolio that can guide your research agenda.
- Search CGEN to broadly find information on an exposure by entering the exposure (e.g., smoking). CGEN will search expansively across all data sources for linked data on the exposure and will return related results.
- Peruse publications from EGRP Active Grants to see what peer-reviewed articles have resulted from grants currently funded by EGRP. The same can be done for inactive grants.
- Identify potential collaborators with expertise in your areas of interest, for example, by using the filter options to browse the list of principal investigators linked to EGRP active or inactive grants and cancer sites.
- Identify an NCI study section for your proposed application by searching through past and current EGRP grants with similar features.
What is the dbGaP link?
The dbGaP link is used to select grants that have associated genotype and phenotype data deposited in the NIH Genotype and Phenotype Database. These data are available for public use after a thorough access process.
How can I find the publications that are related to a grant?
If you wish to see the publications associated with a particular grant, use the search and filtering mechanisms to find the grant you are interested in and then click on that grant entry to view the detail page for that grant. The related publications will be listed in a sidebar on the right-hand side of the detail page.
If you wish to see the publications related to a set of grants, use the search and filtering mechanisms to narrow the list of grants to under 500. Then, on the grants tab, you will see a “Show only linked data” button. Click that button and another button, “View linked data tree”, will appear. Click that button to view a listing of related publications and other related data for this set of grants.
How can I find the grants that are related to a publication?
If you wish to see the grants associated with a particular publication, use the search and filtering mechanisms to find the publication you are interested in and then click on that publication entry to view the detail page for that publication. The related grants will be listed in a sidebar on the right-hand side of the detail page.
If you wish to see the grants related to a set of publications, use the search and filtering mechanisms to narrow the list of publications to under 500. Then, on the publications tab, you will see a “Show only linked data” button. Click that button and another button, “View linked data tree”, will appear. Click that button to view a listing of related grants and other related data for this set of publications.
Can you explain the “Show only linked data” feature?
This feature displays the related data for a subset of publications, grants, or CancerGAMA data. Publications can be related to one or more grants and to CancerGAMA data. These data are linked by PubMed ID and by Grant ID. For any subset of data that is under 500 records, you can see the related data by clicking the “Show only linked data” button. For example, if you are on the publications tab when you click “Show only linked data”, the counts on the other tabs are reduced to only those records that are related to the set of parent data. Currently, Genomic Tests and Genomic Evidence are not linked to any other data. The parent tab is indicated by an arrow and the child tabs are indicated by the infinity sign.
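The linking described above can be pictured as a lookup on shared identifiers. The following is a minimal sketch, not CGEN's actual implementation: the record layout, the field names (`pmid`, `grant_ids`, `grant_id`), the sample values, and the function `linked_children` are all hypothetical and chosen only for illustration.

```python
# Hypothetical records; field names and values are illustrative only.
publications = [
    {"pmid": "1001", "title": "Paper A", "grant_ids": ["G1"]},
    {"pmid": "1002", "title": "Paper B", "grant_ids": ["G2", "G3"]},
    {"pmid": "1003", "title": "Paper C", "grant_ids": []},
]
grants = [
    {"grant_id": "G1", "pi": "Investigator X"},
    {"grant_id": "G2", "pi": "Investigator Y"},
    {"grant_id": "G4", "pi": "Investigator Z"},
]
gama_records = [
    {"pmid": "1002", "gene": "GENE1"},
    {"pmid": "1999", "gene": "GENE2"},
]

MAX_PARENT_RECORDS = 500  # the FAQ requires the parent set to be under 500


def linked_children(parent_pubs, all_grants, all_gama):
    """Return the grants and CancerGAMA records linked to the parent publications."""
    if len(parent_pubs) >= MAX_PARENT_RECORDS:
        raise ValueError("narrow the parent set to under 500 records first")
    # Collect the identifiers carried by the parent (publications) tab.
    pmids = {p["pmid"] for p in parent_pubs}
    grant_ids = {gid for p in parent_pubs for gid in p["grant_ids"]}
    # Keep only child records that share a Grant ID or PubMed ID with a parent.
    return {
        "grants": [g for g in all_grants if g["grant_id"] in grant_ids],
        "gama": [r for r in all_gama if r["pmid"] in pmids],
    }


linked = linked_children(publications, grants, gama_records)
print([g["grant_id"] for g in linked["grants"]])
print([r["gene"] for r in linked["gama"]])
```

In this sketch the publications act as the parent set, and the child lists shrink to the records that share an identifier with it, which mirrors how the counts on the other tabs are reduced.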
What does “Show all (unlinked) data” do?
If you see a button with this label, you have chosen a parent tab and clicked the “Show only linked data” button on that tab. The parent tab is indicated by an arrow and the child tabs are indicated by the infinity sign. Clicking “Show all (unlinked) data” removes the linked-data restriction, so each tab again displays all data that match your previously applied search criteria, whether or not they are linked to the parent set.
If I search for Breast Cancer is it searching for “Breast” and “Cancer” or is it searching for “Breast” or “Cancer”?
The default operator for CGEN is “and”. There is currently no way to conduct an “or” search; all search terms are combined with “and”. A more advanced search mechanism will be implemented for CGEN 2.0.
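As a rough illustration of what “and” means here, the sketch below keeps an entry only when every search term occurs in it. It is a simplified model rather than CGEN's search code; the function `matches_all` and the sample entries are hypothetical.

```python
def matches_all(query: str, entry_text: str) -> bool:
    """Return True only if every query term occurs in the entry (implicit AND)."""
    entry_words = set(entry_text.lower().split())
    return all(term in entry_words for term in query.lower().split())


# Hypothetical entries for illustration.
entries = [
    "Breast cancer risk and diet",
    "Breast density imaging",
    "Lung cancer screening",
]

hits = [e for e in entries if matches_all("Breast Cancer", e)]
print(hits)
```

Only the first entry contains both “breast” and “cancer”, so it is the only hit; an “or” search would have returned all three.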
How can I search for multiple terms in exactly the order I want?
If you wish to search for a phrase, enclose it in double quotes. For example, searching for “Study of Childhood Leukemia by Hispanic Status”, including the double quotes in your search string, will bring back only entries that contain that exact phrase. In contrast, searching for the same phrase without the double quotes would bring back all entries that have the words “Study” and “Childhood” and “Leukemia” and “Hispanic” and “Status” in any order in any field of the entry.
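The difference between a quoted and an unquoted search can be sketched as follows. This is an assumption-laden toy model, not the CGEN search engine: the helpers `tokenize`, `matches_phrase`, `matches_all_terms`, and `search` are hypothetical names, the sample entries are invented, and stopword handling is left out for brevity.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_phrase(phrase, entry_text):
    """True if the phrase's words appear consecutively and in order."""
    p, e = tokenize(phrase), tokenize(entry_text)
    return any(e[i:i + len(p)] == p for i in range(len(e) - len(p) + 1))


def matches_all_terms(query, entry_text):
    """True if every query word appears somewhere in the entry, in any order."""
    entry_words = set(tokenize(entry_text))
    return all(term in entry_words for term in tokenize(query))


def search(query, entries):
    """Treat a query wrapped in double quotes as a phrase, otherwise as AND."""
    query = query.strip()
    if len(query) >= 2 and query[0] == '"' and query[-1] == '"':
        return [e for e in entries if matches_phrase(query[1:-1], e)]
    return [e for e in entries if matches_all_terms(query, e)]


# Hypothetical entries for illustration.
entries = [
    "Study of Childhood Leukemia by Hispanic Status",
    "Leukemia outcomes in childhood cohorts",
]

quoted = search('"childhood leukemia"', entries)
unquoted = search("childhood leukemia", entries)
print(quoted)
print(unquoted)
```

The quoted query returns only the first entry, where the two words are adjacent and in order, while the unquoted query returns both entries.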
Is the search doing an exact match?
No. The CGEN search engine finds not only exact matches to what you type into the search box, but also words that it deems similar enough. It uses a combination of stopwords and stemming. Stopwords are very common, very short words that are likely to be found in every entry, such as “and”, “or”, and “the”. The CGEN search engine ignores these words. Stemming reduces words to their roots in order to find similar words. For example, if you enter the term “relates”, the search engine will return entries that contain “relates” as well as entries that contain “relate”, “related”, “relation”, etc.
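To make stopwords and stemming concrete, here is a deliberately small sketch. The FAQ does not say which stemmer or stopword list CGEN uses, so the suffix rules below are a toy assumption that happens to handle the “relates” example; the stopword set contains only the three words named above, and `stem` and `normalize` are hypothetical helpers.

```python
STOPWORDS = {"and", "or", "the"}  # only the examples named in the FAQ
SUFFIXES = ("ion", "ed", "es", "e", "s")  # toy rule set, checked in this order


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Lowercase, drop stopwords, and reduce each remaining word to its stem."""
    return [stem(w) for w in text.lower().split() if w not in STOPWORDS]


stems = {stem(w) for w in ["relates", "relate", "related", "relation"]}
print(stems)
print(normalize("Relates or the relation"))
```

Because all four word forms reduce to the same stem in this sketch, a query containing any one of them would match entries containing the others. Real stemmers use far more careful rules than this suffix list.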
How are the search results sorted?
The search results are sorted by relevance. This means that if you search for “breast cancer”, the search results will be sorted by the number of times your search term appeared in each entry. If you wish for the search results to be sorted by something else, simply click on the column header in the search results table.
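Relevance sorting as described above, by the number of times the search terms appear, can be sketched like this. It is a simplified assumption about the scoring, not the engine's actual ranking formula; the function `relevance` and the sample entries are hypothetical.

```python
def relevance(query, entry_text):
    """Count how many times the query terms occur in the entry."""
    words = entry_text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


# Hypothetical entries for illustration; each contains both search terms.
entries = [
    "breast cancer screening",
    "breast cancer and breast density in breast cancer survivors",
    "breast cancer and ovarian cancer",
]

# Highest count first, as in a relevance-sorted results table.
ranked = sorted(entries, key=lambda e: relevance("breast cancer", e), reverse=True)
for entry in ranked:
    print(relevance("breast cancer", entry), entry)
```

The entry that mentions the terms five times is listed first, followed by the entries with three and two mentions.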
Can I sort the search results by something else?
Yes, you can. Simply click on a column header in the table you wish to sort. For example, if you are on the CancerGAMA tab and wish to sort the search results by Cancer Site, simply click on the underlined Cancer Site column header in the search results table.