GDC Data Portal User’s Guide
NCI Genomic Data Commons (GDC)

Contents

1 Getting Started
    Getting Started
        The GDC Data Portal: An Overview
        Accessing the GDC Data Portal
        Navigation
            Views
            Toolbar
            Tables
                Table Sort
                Table Arrangement
                Table Size
                Table Export
            Filtering and Searching
                Facet Filters
                Quick Search
                Advanced Search
                Manage Sets

2 Projects
    Projects
        Summary
        Projects Page
            Visualizations
                Top Mutated Cancer Genes in Selected Projects
                Case Distribution per Project
            Projects Table
            Projects Graph
            Facets Panel
        Project Summary Page
            Most Frequently Mutated Genes
            Survival Analysis
            Most Frequent Mutations
            Most Affected Cases

3 Exploration
    Exploration
        Filters / Facets
            Case Filters
                Upload Case Set
            Gene Filters
                Upload Gene Set
            Mutation Filters
                Upload Mutation Set
            Results
                Cases
                Genes
                Mutations
                OncoGrid
                OncoGrid Options
            File Navigation

4 Repository
    Repository
        Summary
        Filters / Facets
            Facets Panel
            Adding Custom Facets
            Files List
            Cases List
            Navigation
        Case Summary Page
            Clinical and Biospecimen Information
            Biospecimen Search
            Most Frequent Somatic Mutations
        File Summary Page
            BAM Slicing

5 Genes and Mutations
    Gene and Mutation Summary Pages
        Gene Summary Page
            Summary
            External References
            Cancer Distribution
            Protein Viewer
            Most Frequent Mutations
        Mutation Summary Page
            Summary
                External References
            Consequences
            Cancer Distribution
            Protein Viewer

6 Custom Set Analysis
    Custom Set Analysis
        Generating a Cohort for Analysis
        Analysis Page
        Analysis Page: Set Operations
        Analysis Tab: Cohort Comparison
        Analysis Page: Results

7 Annotations
    Annotations
        Annotations View
            Facets Panel
                Annotation Categories and Classification
        Annotation Detail Page

8 Advanced Search
    Advanced Search
        Overview: GQL
        Switching between Advanced Search and Facet Filters
        Using the Advanced Search
            Auto-complete
                Field Auto-complete
            Value Auto-complete
        Setting Precedence of Operators
        Keywords
            AND Keyword
            OR Keyword
        Operators
            List of Operators and Query format
            “=” operator - EQUAL
            “!=” operator - NOT EQUAL
            “>” operator - GREATER THAN
            “>=” operator - GREATER THAN OR EQUALS
            “<” operator - LESS THAN
            “<=” operator - LESS THAN OR EQUALS
            “IN” Operator
            “EXCLUDE” Operator
            “IS MISSING” Operator
            “NOT MISSING” Operator
        Special Cases
            Date format
            Using Quotes
            Age at Diagnosis - Unit in Days
        Fields Reference
            Files
            Cases

9 Authentication
    Authentication
        Overview
        Logging into the GDC
        User Profile
        GDC Authentication Tokens
        Logging Out

10 File Cart
    Cart and File Download
        Overview
        GDC Cart
            Cart Summary
            Cart Items
        Download Options
            GDC Data Transfer Tool
            Individual Files Download
        Controlled Files

11 Legacy Archive
    Legacy Archive
        Overview
            File Page
                Archive
                Metadata files
            File Cart

12 Release Notes
    Data Portal Release Notes
        Release 1.11.0
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds
        Release 1.10.0
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds
        Release 1.9.0
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds
        Release 1.8.0
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds
        Release 1.6.0
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds
        Release 1.5.2
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds
        Release 1.4.1
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds
        Release 1.3.0
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds
        Release 1.2.0
            New Features and Changes
            Bugs Fixed Since Last Release
        Release 1.1.0
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds
        Release 1.0.1
            New Features and Changes
            Bugs Fixed Since Last Release
            Known Issues and Workarounds

Chapter 1

Getting Started
Getting Started
The GDC Data Portal: An Overview
The Genomic Data Commons (GDC) Data Portal provides users with web-based access to data from cancer genomics studies.
Key GDC Data Portal features include:
• Open, granular access to information about all datasets available in the GDC
• Advanced search and visualization-assisted filtering of data files
• Data visualization tools to support the analysis and exploration of data (including on a gene and mutation level from
Open-Access MAF files)
• Cart for collecting data files of interest
• Authentication using eRA Commons credentials for access to controlled data files
• Secure data download directly from the cart or using the GDC Data Transfer Tool
For more information about available datasets, see the GDC Website.

Accessing the GDC Data Portal
The GDC Data Portal is accessible using a web browser such as Chrome, Internet Explorer, or Firefox at the following URL:
https://portal.gdc.cancer.gov
The front page displays a summary of all available datasets:

Navigation
Views
The GDC Data Portal provides five navigation options (Views) for browsing available harmonized datasets:

• Projects: The Projects link directs users to the Projects Page, which gives an overall summary of project-level information,
including the available data for each project.
• Exploration: The Exploration link takes users to the Exploration Page, which allows users to explore data using
various case, gene, and mutation filters.
• Analysis: The Analysis link directs users to the Analysis Page, which provides tools for comparing
different cohorts. These cohorts can be generated either with existing filters (e.g. males with lung cancer) or through custom
selection.
• Repository: The Repository link directs users to the Repository Page. Here users can see the data files available for
download at the GDC and apply file/case filters to narrow down their search.
• Human Outline: The home page displays a human anatomical outline that users can use to refine their search. Choosing
an associated organ will direct the user to a listing of all projects associated with that primary site. For example, clicking
on the human brain will show only cases and projects associated with brain cancer (TCGA-GBM and TCGA-LGG). The
number of cases associated with each primary site is also displayed here and separated by project.
Each view provides a distinct representation of the same underlying set of GDC data and metadata. The GDC also provides
access to certain unharmonized data files generated by GDC-hosted projects. These files and their associated metadata are not
represented in the views above; instead they can be found in the GDC Legacy Archive.
The Projects, Exploration, Analysis and Repository pages can be accessed from the GDC Data Portal front page and from the
toolbar (see below). The Annotations view is accessible from the Repository view. A link to the GDC Legacy Archive is available on
the GDC Data Portal front page and in the GDC Apps menu (see below).

Toolbar
The toolbar available at the top of all pages in the GDC Data Portal provides convenient navigation links and access to
authentication and quick search.
The left portion of this toolbar provides access to the Home Page, Projects Page, Exploration Page, Analysis Page, and
Repository Page:

The right portion of this toolbar provides access to quick search, the cart, and the GDC Apps menu:

The GDC Apps menu provides links to all resources provided by the GDC, including the GDC Legacy Archive.

Tables
Tabular listings are the primary method of representing available data in the GDC Data Portal. Tables are available in all views
and in the file cart. Users can customize each table by specifying columns, size, and sorting.
Table Sort
The sort table button is available in the top right corner of each table. To sort by a column, place a checkmark next to it and
select the preferred sort direction. If multiple columns are selected for sorting, data is sorted column-by-column in the
order that columns appear in the sort menu: the topmost selected column becomes the primary sorting parameter; the selected
column below it is used for secondary sort, etc.
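
The column-by-column ordering described above can be sketched in Python as follows. This is only an illustration of the sorting rule, not the portal's own implementation; the function name, the row data, and the column names are assumptions made for the example.

```python
from operator import itemgetter

def sort_rows(rows, sort_spec):
    """Sort rows by several columns.

    sort_spec is an ordered list of (column, direction) pairs. The first
    pair is the primary sort key, the second is the secondary key, etc.
    """
    result = list(rows)
    # Python's sort is stable, so applying the keys from last to first
    # leaves the first key as the primary ordering.
    for column, direction in reversed(sort_spec):
        result.sort(key=itemgetter(column), reverse=(direction == "desc"))
    return result

# Made-up rows; the column names and counts are illustrative only.
rows = [
    {"project": "TCGA-LGG", "cases": 10},
    {"project": "TCGA-GBM", "cases": 20},
    {"project": "TCGA-GBM", "cases": 30},
]

# Primary sort: project, ascending. Secondary sort: cases, descending.
sorted_rows = sort_rows(rows, [("project", "asc"), ("cases", "desc")])
```

Here the two TCGA-GBM rows come first (primary key), and within them the row with more cases comes first (secondary key).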

Table Arrangement
The arrange columns button allows users to adjust the order of columns in the table and select which columns are displayed.
Table Size
Table size can be adjusted using the menu in the bottom left corner of the table. The menu sets the maximum number of rows to
display. If the number of entries to be displayed exceeds the maximum number of rows, then the table will be paginated, and
navigation buttons will be provided in the bottom right corner of the table to navigate between pages.
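
As a rough sketch of the pagination rule above, the number of pages is the number of entries divided by the maximum number of rows, rounded up. The function and variable names below are hypothetical and chosen only for this example.

```python
import math

def paginate(entries, page_size, page_number):
    """Return (rows for the requested 1-based page, total page count)."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    # An empty table is still shown as a single (empty) page.
    total_pages = max(1, math.ceil(len(entries) / page_size))
    if not 1 <= page_number <= total_pages:
        raise ValueError("page_number out of range")
    start = (page_number - 1) * page_size
    return entries[start:start + page_size], total_pages

entries = list(range(45))  # stand-in for 45 table entries
rows, total_pages = paginate(entries, page_size=20, page_number=3)
```

With 45 entries and a table size of 20, the table has three pages, and the last page holds the remaining five entries.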
Table Export
In the Repository, Projects, and Annotations views, tables can be exported in either a JSON or TSV format. The JSON button
will export the entire table’s contents into a JSON file. The TSV button will export the current view of the table into a TSV file.
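
A downloaded export can be read back with the Python standard library. The sketch below assumes that the JSON export is an array of row objects and that the TSV export begins with a header row; neither layout is specified in this guide, and the sample contents and column names are made up.

```python
import csv
import io
import json

def load_json_table(text):
    """Parse a JSON export, assumed here to be an array of row objects."""
    rows = json.loads(text)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of rows")
    return rows

def load_tsv_table(text):
    """Parse a TSV export, assumed here to start with a header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)

# Made-up sample contents standing in for downloaded files.
json_text = '[{"project_id": "TCGA-GBM", "primary_site": "Brain"}]'
tsv_text = "Project\tPrimary Site\nTCGA-GBM\tBrain\n"

json_rows = load_json_table(json_text)
tsv_rows = load_tsv_table(tsv_text)
```

Note that values read from a TSV file are always strings, while JSON preserves numbers and nested values.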

Figure 1.1: Selecting table columns

Figure 1.2: Specifying table size

Filtering and Searching
The GDC Data Portal offers three different means of searching and filtering the available data: facet filters, quick search, and
advanced search.
Facet Filters
Facets on the left of each view (Projects, Exploration, and Repository) represent properties of the data that can be used for
filtering. Some of the available facets are project name, disease type, patient gender and age at diagnosis, and various data
formats and categories. Each facet displays the name of the data property, the available values, and numbers of matching entities
for each value (files, cases, mutations, genes, annotations, or projects, depending on the context).
Below are two file facets available in the Repository view. A Data Type facet filter is applied, filtering for “Aligned Reads” files.
Multiple selections within a facet are treated as an “OR” query: e.g. “Aligned Reads” OR “Annotated Somatic Mutation”.
Selections in different facets are treated as “AND” queries: e.g. Data Type: “Aligned Reads” AND Experimental Strategy:
“RNA-Seq”.
The information displayed in each facet reflects this: in the example above, marking the “Aligned Reads” checkbox does not
change the numbers or the available values in the Data Type facet where the checkbox is found, but it does change the values
available in the Experimental Strategy facet. The Experimental Strategy facet now displays only values from files of Data Type
“Aligned Reads”.
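
The OR-within-a-facet and AND-across-facets behavior, together with the way counts are displayed, can be sketched as below. This is an illustrative model rather than the portal's code; the field names and file records are assumptions for the example.

```python
def matches(record, selections):
    """Return True if a record passes the selected facet values.

    selections maps a facet name to the set of values checked in that facet.
    Values within one facet are ORed; different facets are ANDed.
    Facets with no checked values do not restrict the result.
    """
    return all(
        record.get(facet) in values
        for facet, values in selections.items()
        if values
    )

def facet_counts(records, selections, facet):
    """Count values for one facet, applying only the other facets' filters."""
    others = {name: vals for name, vals in selections.items() if name != facet}
    counts = {}
    for record in records:
        if matches(record, others):
            value = record.get(facet)
            counts[value] = counts.get(value, 0) + 1
    return counts

# Made-up file records; field names and values are illustrative.
files = [
    {"data_type": "Aligned Reads", "experimental_strategy": "RNA-Seq"},
    {"data_type": "Aligned Reads", "experimental_strategy": "WXS"},
    {"data_type": "Annotated Somatic Mutation", "experimental_strategy": "WXS"},
]
selections = {"data_type": {"Aligned Reads"}}

selected = [f for f in files if matches(f, selections)]
data_type_counts = facet_counts(files, selections, "data_type")
strategy_counts = facet_counts(files, selections, "experimental_strategy")
```

In this model the Data Type counts ignore the Data Type selection itself, so they stay unchanged, while the Experimental Strategy counts reflect only "Aligned Reads" files.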
Custom facet filters can be added in Repository View to expand the GDC Data Portal’s filtering capabilities.
Figure 1.3: Facets (no filter applied)

Quick Search
The quick search feature allows users to find cases, files, mutations, or genes using a search query (i.e. UUID, filename, gene
name, DNA Change, project name, id, disease type or primary site). Quick search is available by clicking on the magnifier in the
right section of the toolbar (which appears on every page) or by using the search bar on the Home Page.

Search results are displayed as the user is typing, with labels indicating the type of each search result in the list (project, case,
or file). Users will see a brief description of the search results, which may include the UUID, submitter ID, or file name.
Clicking on a selected result or pressing enter will open a detail page with additional information.
Home Page Quick Search:

Toolbar Quick Search:

Advanced Search
Advanced Search is available in Repository View. It allows users to construct complex queries with a custom query language and
auto-complete suggestions. See Advanced Search for details.
Manage Sets
The Manage Sets button at the top of the GDC Portal opens the Manage Sets page, which stores sets of cases, genes, or mutations
of interest. On this page, users can review the sets that have been saved as well as upload new sets and delete existing sets.

Upload Sets

Clicking the Upload Set button shows options for creating Case, Gene, or Mutation sets.

Upon clicking one of the menu items, users are shown a dialog where they can enter unique identifiers (e.g. UUIDs, TCGA
Barcodes, gene symbols, or mutation UUIDs) that describe the set.

Clicking the Submit button will add the set of items to the list of sets on the Manage Sets page.

Export Sets

Users can export selected sets on this page by first clicking the checkboxes next to each set, then clicking the
Export selected button at the top of the table.


A text file containing the UUID of each case, gene or mutation is downloaded after clicking this button.
Review Sets

A few buttons in the list of sets allow a user to get further information about each set.

• Items: Clicking the link under the Items column navigates the user to the Exploration page using the set as a filter.
• Download/View: To the right of the Items column are buttons that will download the list as a TSV file or open the cases in
the Repository page.
Creating Sets from GDC Portal Filters

Many pages on the GDC Portal have an option called Save Sets that allows users to save a group of cases, mutations, or genes
for further analysis. For example, after applying filters on the Exploration page, users can click the Save
Case/Gene/Mutation Set button to save the matching set.


Chapter 2

Projects
Summary
At a high level, data in the Genomic Data Commons is organized by project. Typically, a project is a specific effort to look at
particular type(s) of cancer undertaken as part of a larger cancer research program. The GDC Data Portal allows users to access
aggregate project-level information via the Projects Page and Project Summary pages.

Projects Page
The Projects Page provides an overview of all harmonized data available in the Genomic Data Commons, organized by project.
It also provides filtering, navigation, and advanced visualization features that allow users to identify and browse projects of
interest. Users can access the Projects Page from the GDC Data Portal Home page, from the Data Portal toolbar, or directly at
https://portal.gdc.cancer.gov/projects.
On the left, a panel of facets allows users to apply filters to find projects of interest. When facet filters are applied, the table
and visualizations on the right are updated to display only the matching projects. When no filters are applied, all projects are
displayed.
The right side of this page displays a few visualizations of the data (Top Mutated Cancer Genes in Selected Projects and Case Distribution
per Project). Below these graphs is a table that contains a list of projects and select details about each project, such as the
number of cases and data files. The Graph tab provides a visual representation of this information.


Visualizations


Top Mutated Cancer Genes in Selected Projects
This dynamically generated bar graph shows the 20 genes with the most mutations across all projects. The genes are limited
to those that are part of the Cancer Gene Census and that have the following types of mutations: missense_variant,
frameshift_variant, start_lost, stop_lost, initiator_codon_variant, and stop_gained. The bars represent the frequency of
each mutation and are broken down into different colored segments by project and disease type. The graphic is updated
as filters are applied for projects, programs, disease types, and data categories available in the project. Note that, due to
these filters, the number of cases displayed here will be less than the total number of cases per project.
Hovering the cursor over each bar will display information about the number of cases affected by the disease type, and clicking
on each bar will launch the Gene Summary Page for the gene associated with the mutation.
Users can toggle the Y-axis of this bar graph between the percentage and the raw number of cases affected.
Case Distribution per Project
A pie chart displays the relative number of cases for each primary site (inner circle), which is further divided by project (outer
circle). Hovering the cursor over each portion of the graph will display the primary site or project with the number of associated
cases. Filtering projects in the left panel will update the pie chart.

Projects Table
The Table tab lists projects by Project ID and provides additional information about each project. If no facet filters have been
applied, the table will display all available projects; otherwise it will display only those projects that match the selected criteria.


The table provides links to Project Summary pages in the Project ID column. Columns with file and case counts include links to
open the corresponding files or cases in the Repository Page.

Projects Graph
The Graph tab contains an interactive view of information in the Table tab. The numerical values in Case Count, File Count,
and File Size columns are represented by bars of varying length according to size. These columns are sorted independently in
descending order. Mousing over an element of the graph connects it to associated elements in other columns, including Project ID
and Primary Site.

Most elements in the graph are clickable, allowing the user to open the associated cases or files in the Repository Page.
Like the projects table, the graph will reflect any applied facet filters.

Facets Panel
Facets represent properties of the data that can be used for filtering. The facets panel on the left allows users to filter the projects
presented in the Table and Graph tabs as well as visualizations.


Users can filter by the following facets:
• Project: Individual project ID
• Primary Site: Anatomical site of the cancer under investigation or review
• Program: Research program that the project is part of
• Disease Type: Type of cancer studied
• Data Category: Type of data available in the project
• Experimental Strategy: Experimental strategies used for molecular characterization of the cancer

Filters can be applied by selecting values of interest in the available facets, for example “WXS” and “RNA-Seq” in the
“Experimental Strategy” facet and “Brain” in the “Primary Site” facet. When facet filters are applied, the Table and Graph tabs
are updated to display matching projects, and the banner above the tabs summarizes the applied filters. The banner allows the
user to click on filter elements to remove the associated filters, and includes a link to view the matching cases and files.
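
The way the example selections combine can be sketched in Python. This is only an illustration under stated assumptions: it assumes that values chosen within one facet are treated as alternatives and that different facets are combined with "and", and it borrows the nested `op`/`content` JSON layout used by GDC API filters. The function name `facet_filter` and the field names are hypothetical and are not taken from this guide.

```python
import json


def facet_filter(selections):
    """Build a nested filter from {field: [values]} selections.

    Assumption: values inside one facet are alternatives (a single "in"
    clause), and clauses from different facets are joined with "and".
    """
    clauses = [
        {"op": "in", "content": {"field": field, "value": list(values)}}
        for field, values in selections.items()
        if values  # skip facets with nothing selected
    ]
    if not clauses:
        return {}
    if len(clauses) == 1:
        return clauses[0]
    return {"op": "and", "content": clauses}


# Hypothetical field names standing in for the two facets in the example.
selections = {
    "summary.experimental_strategies.experimental_strategy": ["WXS", "RNA-Seq"],
    "primary_site": ["Brain"],
}
print(json.dumps(facet_filter(selections), indent=2))
```

Removing a filter element from the banner corresponds to dropping a value (or a whole field) from `selections` and rebuilding the filter.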


For information on how to use facet filters, see Getting Started.

Project Summary Page
Each project has a summary page that provides an overview of all available cases, files, and annotations. Clicking on the
numbers in the summary table will display the corresponding data.

Three download buttons in the top right corner of the screen allow the user to download the entire project dataset, along with
the associated project metadata:
• Download Biospecimen: Downloads biospecimen metadata associated with all cases in the project.
• Download Clinical: Downloads clinical metadata about all cases in the project.
• Download Manifest: Downloads a manifest for all data files available in the project. The manifest can be used with the
GDC Data Transfer Tool to download the files.
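
Before starting a large download, it can be useful to check how many files a manifest lists and how much data they represent. The sketch below assumes the manifest is a tab-separated text file with `id`, `filename`, `md5`, `size`, and `state` columns; the rows shown are made up for illustration, and `summarize_manifest` is a hypothetical helper, not part of any GDC tool.

```python
import csv
import io

# Hypothetical manifest content: the IDs, file names, checksums, and sizes
# are invented for illustration only.
MANIFEST_TEXT = "\n".join([
    "id\tfilename\tmd5\tsize\tstate",
    "00000000-0000-0000-0000-000000000001\texample_a.maf.gz\t"
    "d41d8cd98f00b204e9800998ecf8427e\t1024\treleased",
    "00000000-0000-0000-0000-000000000002\texample_b.bam\t"
    "d41d8cd98f00b204e9800998ecf8427e\t2048\treleased",
]) + "\n"


def summarize_manifest(text):
    """Return (number of files, total size in bytes) for a manifest."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = list(reader)
    total_bytes = sum(int(row["size"]) for row in rows)
    return len(rows), total_bytes


file_count, total_bytes = summarize_manifest(MANIFEST_TEXT)
print(f"{file_count} files, {total_bytes} bytes")
```

For a manifest saved from the portal, the same function can be applied to the text read from the downloaded file.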

Most Frequently Mutated Genes
The Project Summary page also reports the genes that have somatic mutations in the greatest numbers of cases in a graphical
and tabular format.


The top of this section contains a bar graph of the most frequently mutated genes as well as a survival plot of all the cases within
the specified project. Hovering over each bar in the plot will display information about the number of cases affected. Users may
choose to download the underlying data in JSON or TSV format or an image of the graph in SVG or PNG format by clicking the
download icon at the top of each graph.
Also at the top of this section are two links: OncoGrid and Open in Exploration. The OncoGrid button will take the user to
the OncoGrid. Open in Exploration will take the user to the Exploration page with filters applied for the current
project.
Below these graphs is a tabular view of the genes affected, which includes the following information:
• Symbol: The gene symbol, which links to the Gene Summary Page
• Name: Full name of the gene
• Cytoband: The location of the mutation on the chromosome in terms of Giemsa-stained samples
• Affected Cases in Project: The number of cases within the project that contain a mutation on this gene, which links
to the Cases tab in the Exploration Page
• Affected Cases Across the GDC: The number of cases within all the projects in the GDC that contain a mutation on
this gene. Clicking the red arrow will display the cases broken down by project
• Mutations: The number of SSMs (simple somatic mutations) detected in that gene, which links to the Mutation tab
in the Exploration Page
• Annotations: Includes a COSMIC symbol if the gene belongs to The Cancer Gene Census
• Survival Analysis: An icon that, when clicked, will plot the survival rate between cases in the project with mutated and
non-mutated forms of the gene
Survival Analysis
Survival analysis is used to analyze the occurrence of event data over time. In the GDC, survival analysis is performed on the
mortality of the cases. Survival analysis requires:
• Data on the time to a particular event (days to death or last follow up)
– Fields: diagnoses.days_to_death and diagnoses.days_to_last_follow_up
• Information on whether the event has occurred (alive/deceased)
– Fields: diagnoses.vital_status
• Data split into different categories or groups (e.g. gender)
– Fields: demographic.gender
The survival analysis in the GDC uses a Kaplan-Meier estimator:

$$S(t_i) = S(t_{i-1}) \left(1 - \frac{d_i}{n_i}\right), \qquad S(t_0) = 1$$

Where:
• S(t_i) is the estimated survival probability for any particular one of the t time periods
• n_i is the number of subjects at risk at the beginning of time period t_i
• and d_i is the number of subjects who die during time period t_i
The table below is an example data set to calculate survival for a set of seven cases:

The calculated cumulated survival probability can be plotted against the interval to obtain a survival plot like the one shown
below.
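
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration of the standard Kaplan-Meier product-limit calculation, not the portal's own implementation. The seven cases below are hypothetical values chosen for the sketch and are not the example data set referred to in this guide. A case counts as an event when it died (a days to death value is present) and as censored when only a last follow-up time is known.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t_i, S(t_i))] for each distinct time with at least one death.

    durations: days to death or to last follow-up, one value per case
    events:    True if the case died at that time, False if it was censored
    """
    deaths = Counter(t for t, died in zip(durations, events) if died)
    leaving = Counter(durations)  # deaths plus censored cases at each time
    at_risk = len(durations)      # n_i before the first time period
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk  # multiply by (1 - d_i / n_i)
            curve.append((t, survival))
        at_risk -= leaving[t]  # drop everyone who died or was censored at t
    return curve


# Hypothetical data for seven cases (not taken from the guide).
days = [50, 120, 120, 200, 310, 400, 400]
died = [True, True, False, True, False, True, False]

for t, s in kaplan_meier(days, died):
    print(f"day {t}: S = {s:.3f}")
```

Cases censored at the same time as a death are counted as at risk for that time period, which is the usual convention.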


Most Frequent Mutations
At the top of this section is a survival plot of all the cases within the specified exploration page filters.


A table is displayed below that lists information about each mutation:
• Mutation ID: A UUID for the mutation assigned by the GDC. Clicking it brings the user to the Mutation Summary
Page
• DNA Change: The chromosome and starting coordinates of the mutation are displayed along with the nucleotide
differences between the reference and tumor allele
• Type: A general classification of the mutation
• Consequences: The effects the mutation has on the gene coding for a protein (e.g. synonymous, missense, non-coding
transcript). A link to the Gene Summary Page for the gene affected by the mutation is included
• Affected Cases in Project: The number of affected cases in the project expressed as a fraction and percentage
• Affected Cases Across the GDC: The number of affected cases across all projects. Choosing
the arrow next to the percentage will display a breakdown by affected project
• Impact: A subjective classification of the severity of the variant consequence. This is determined using Ensembl VEP,
PolyPhen, and SIFT. The categories are described under Impact (VEP) in the Mutations section of the Exploration chapter.
• Survival Analysis: An icon that, when clicked, will plot the survival rate between the gene’s mutated and non-mutated
cases

Most Affected Cases
The final section of the Project Summary page is a display of the top 20 cases in a specified project, with the greatest number of
affected genes.


Below the bar graph is a table that contains information about these cases:
• UUID: The UUID of the case, which links to the Case Summary Page
• Submitter ID: The Submitter ID of the case (e.g. the TCGA Barcode)
• Site: The anatomical location of the site affected
• Gender: Text designations that identify gender. Gender is described as the assemblage of properties that distinguish
people on the basis of their societal roles
• Age at Diagnosis: Age at the time of diagnosis expressed in number of days since birth
• Stage: The extent of a cancer in the body. Staging is usually based on the size of the tumor, whether lymph nodes contain
cancer, and whether the cancer has spread from the original site to other parts of the body. The accepted values for
tumor_stage depend on the tumor site, type, and accepted staging system
• Survival (days): The number of days until death
• Last Follow Up (days): Time interval from the date of last follow up to the date of initial pathologic diagnosis, represented
as a calculated number of days
• Available Files per Data Category: Five columns displaying the number of files available in each of the five data
categories. These link to the files for the specific case
• Mutations: The number of mutations for the case
• Genes: The number of genes affected by mutations for the case


Chapter 3

Exploration
The Exploration page allows users to explore data in the GDC using advanced filters/facets, including those at the gene and
mutation level. Users choose filters on specific Cases, Genes, and/or Mutations on the left of this page and can then visualize
these results on the right. The Gene/Mutation data for these visualizations comes from the Open-Access MAF files on the GDC
Portal.

Filters / Facets
On the left of this page, users can create advanced filters to narrow down results and build synthetic cohorts.


Case Filters
The first tab of filters is for cases in the GDC.


These criteria limit the results only to specific cases within the GDC. The default filters available are:
• Case: Specify individual cases using submitter ID (barcode), UUID, or list of Cases (‘Case Set’)
• Case Submitter ID: Search for cases using a part (prefix) of the submitter ID (barcode).
• Primary Site: Anatomical site of the cancer under investigation or review.
• Program: A cancer research program, typically consisting of multiple focused projects.
• Project: A cancer research project, typically part of a larger cancer research program.
• Disease Type: Type of cancer studied.
• Gender: Gender of the patient.
• Age at Diagnosis: Patient age at the time of diagnosis.
• Vital Status: Indicator of whether the patient was living or deceased at the date of last contact.
• Days to Death: Number of days from date of diagnosis to death of the patient.
• Race: Race of the patient.
• Ethnicity: Ethnicity of the patient.

In addition to the defaults, users can add more case filters by clicking on the link titled ‘Add a Case Filter’.
Upload Case Set
In the Cases filters panel, instead of supplying cases one-by-one, users can supply a list of cases. Clicking on the Upload Case Set
button will launch a dialog as shown below, where users can supply a list of cases or upload a comma-separated text file of cases.

After a list of cases is supplied, a table will appear below that indicates whether each case was found.


Clicking on Submit will filter the results in the Exploration Page by those cases.
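
When preparing a comma-separated text file for upload, it can help to normalize the list first. The sketch below splits on commas and whitespace, drops blanks, and removes duplicates while keeping the original order. How the portal itself tokenizes uploaded text is not described in this guide, so the splitting rule is an assumption, and both `parse_identifier_list` and the barcodes shown are hypothetical.

```python
import re


def parse_identifier_list(text):
    """Split on commas/whitespace, drop blanks, keep first-seen order."""
    seen = set()
    identifiers = []
    for token in re.split(r"[,\s]+", text):
        if token and token not in seen:
            seen.add(token)
            identifiers.append(token)
    return identifiers


# Hypothetical submitter IDs, including one duplicate and uneven spacing.
raw = "TCGA-XX-0001, TCGA-XX-0002,\nTCGA-XX-0001 , TCGA-XX-0003\n"
cases = parse_identifier_list(raw)
print(",".join(cases))
```

The same helper works for gene symbols or mutation identifiers, since it makes no assumption about what each token looks like.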


Gene Filters
The second tab of filters is for genes affected by mutations in the GDC.


Users can filter by:
• Gene - A specific Gene Symbol, ID, or list of Genes (‘Gene Set’)
• Biotype - Classification of the type of gene according to Ensembl. The biotypes can be grouped into protein coding,
pseudogene, long noncoding and short noncoding. Examples of biotypes in each group are as follows:
– Protein coding: IGC gene, IGD gene, IG gene, IGJ gene, IGLV gene, IGM gene, IGV gene, IGZ gene, nonsense
mediated decay, nontranslating CDS, non stop decay, polymorphic pseudogene, TRC gene, TRD gene, TRJ gene.
– Pseudogene: disrupted domain, IGC pseudogene, IGJ pseudogene, IG pseudogene, IGV pseudogene, processed
pseudogene, transcribed processed pseudogene, transcribed unitary pseudogene, transcribed unprocessed pseudogene,
translated processed pseudogene, TRJ pseudogene, unprocessed pseudogene
– Long noncoding: 3prime overlapping ncrna, ambiguous orf, antisense, antisense RNA, lincRNA, ncrna host, processed
transcript, sense intronic, sense overlapping
– Short noncoding: miRNA, miRNA_pseudogene, miscRNA, miscRNA pseudogene, Mt rRNA, Mt tRNA, rRNA,
scRNA, snlRNA, snoRNA, snRNA, tRNA, tRNA_pseudogene
• Is Cancer Gene Census - Whether or not a gene is part of The Cancer Gene Census

Upload Gene Set
In the Genes filters panel, instead of supplying genes one-by-one, users can supply a list of genes. Clicking on the Upload Gene Set
button will launch a dialog as shown below, where users can supply a list of genes or upload a comma-separated text file of genes.

After a list of genes is supplied, a table will appear below that indicates whether each gene was found.

Clicking on Submit will filter the results in the Exploration Page by those genes.

Mutation Filters
The final tab of filters is for specific mutations.


Users can filter by:
• Mutation - Unique ID for that mutation. Users can use the following:
– UUID - c7c0aeaa-29ed-5a30-a9b6-395ba4133c63
– DNA Change - chr12:g.121804752delC
– COSMIC ID - COSM202522
– List of any mutation UUIDs or DNA Change IDs (‘Mutation Set’)

• Consequence Type - Consequence type of this variation; sequence ontology terms
• Impact - A subjective classification of the severity of the variant consequence. This information comes from the Ensembl
VEP.
• Type - A general classification of the mutation
• Variant Caller - The variant caller used to identify the mutation
• COSMIC ID - The identifier of the gene or mutation maintained in COSMIC, the Catalogue Of Somatic Mutations In
Cancer
• dbSNP rs ID - The reference SNP identifier maintained in dbSNP
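
The three identifier styles accepted by the Mutation filter can be told apart by their shape. The sketch below classifies a string using regular expressions modeled only on the three examples given above; the patterns are assumptions and may not cover every valid identifier, and `classify_mutation_id` is a hypothetical helper.

```python
import re

# Patterns modeled on the examples in this guide; they are illustrative
# assumptions, not an official specification of each identifier format.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)
DNA_CHANGE_RE = re.compile(r"^chr[0-9A-Za-z]+:g\.\d+\S+$")
COSMIC_RE = re.compile(r"^COSM\d+$")


def classify_mutation_id(value):
    """Return 'uuid', 'dna_change', 'cosmic_id', or 'unknown'."""
    value = value.strip()
    if UUID_RE.match(value):
        return "uuid"
    if DNA_CHANGE_RE.match(value):
        return "dna_change"
    if COSMIC_RE.match(value):
        return "cosmic_id"
    return "unknown"


for example in (
    "c7c0aeaa-29ed-5a30-a9b6-395ba4133c63",
    "chr12:g.121804752delC",
    "COSM202522",
):
    print(example, "->", classify_mutation_id(example))
```

A check like this can be run over a list before uploading it as a Mutation Set, to catch entries that match none of the expected shapes.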
Upload Mutation Set
In the Mutations filters panel, instead of supplying mutation IDs one-by-one, users can supply a list of mutations. Clicking on
the Upload Mutation Set button will launch a dialog as shown below, where users can supply a list of mutations or upload a
comma-separated text file of mutations.

After a list of mutations is supplied, a table will appear below that indicates whether each mutation was found.


Clicking on Submit will filter the results in the Exploration Page by those mutations.


Results
As users add filters to the data on the Exploration Page, the Results section will automatically be updated. Results are divided
into different tabs: Cases, Genes, Mutations, and OncoGrid.
To illustrate these tabs, Case, Gene, and Mutation filters have been chosen (genes in the Cancer Gene Census that have HIGH
Impact mutations in the TCGA-BRCA project), and a description of what each tab displays follows.
Cases
The Cases tab gives an overview of all the cases/patients who correspond to the filters chosen (Cohort).


The top of this section contains a few pie graphs with categorical information regarding the Primary Site, Project, Disease Type,
Gender, and Vital Status.
Below these pie charts is a tabular view of cases (which can be exported, sorted and saved using the buttons on the right), that
includes the following information:
• Case ID (Submitter ID): The Case ID / submitter ID of that case/patient (e.g. TCGA Barcode)
• Project: The study name for the project to which the case belongs
• Primary Site: The primary site of the cancer/project
• Gender: The gender of the case
• Files: The total number of files available for that case
• Available Files per Data Category: Five columns displaying the number of files available in each of the five data
categories. These link to the files for the specific case.
• Mutations: The number of SSMs (simple somatic mutations) detected in that case
• Genes: The number of genes affected by mutations in that case

Note: By default, the Case UUID is not displayed. You can display the UUID of the case by clicking on the icon with 3 parallel
lines and choosing to display the Case UUID.


Genes
The Genes tab will give an overview of all the genes that match the criteria of the filters (Cohort).

The top of this section contains a survival plot of all the cases within the specified Exploration page search, in addition to a bar
graph of the most frequently mutated genes. Hovering over each bar in the plot will display information about the percentage of
cases affected. Users may choose to download the underlying data in JSON or TSV format or an image of the graph in SVG or
PNG format by clicking the download icon at the top of each graph.
Below these graphs is a tabular view of the genes affected, which includes the following information:
• Symbol: The gene symbol, which links to the Gene Summary Page
• Name: Full name of the gene
• Cytoband: The location of the mutation on the chromosome in terms of Giemsa-stained samples
• Type: The type of gene
• Affected Cases in Cohort: The number of cases affected in the Cohort
• Affected Cases Across all Projects: The number of cases within all the projects in the GDC that contain a mutation
on this gene. Clicking the red arrow will display the cases broken down by project
• Mutations: The number of SSMs (simple somatic mutations) detected in that gene
• Annotations: Includes a COSMIC symbol if the gene belongs to The Cancer Gene Census
• Survival Analysis: An icon that, when clicked, will plot the survival rate between cases in the project with mutated and
non-mutated forms of the gene

Mutations
The Mutations tab will give an overview of all the mutations that match the criteria of the filters (Cohort).


At the top of this tab is a survival plot of all the cases within the specified exploration page filters.
A table is displayed below that lists information about each mutation:
• DNA Change: The chromosome and starting coordinates of the mutation are displayed along with the nucleotide
differences between the reference and tumor allele
• Type: A general classification of the mutation
• Consequences: The effects the mutation has on the gene coding for a protein (e.g. synonymous, missense, non-coding
transcript). A link to the Gene Summary Page for the gene affected by the mutation is included
• Affected Cases in Cohort: The number of affected cases in the Cohort as a fraction and as a percentage
• Affected Cases Across all Projects: The number of affected cases across all projects. Choosing the arrow next to the
percentage will display a breakdown by affected project
• Impact (VEP): A subjective classification of the severity of the variant consequence. This information comes from the
Ensembl VEP. The categories are:
– HIGH (H): The variant is assumed to have high (disruptive) impact in the protein, probably causing protein
truncation, loss of function, or triggering nonsense mediated decay
– MODERATE (M): A non-disruptive variant that might change protein effectiveness
– LOW (L): Assumed to be mostly harmless or unlikely to change protein behavior
– MODIFIER (MO): Usually non-coding variants or variants affecting non-coding genes, where predictions are difficult
or there is no evidence of impact
• Survival Analysis: An icon that, when clicked, will plot the survival rate between the gene’s mutated and non-mutated
cases
Note: By default, the Mutation UUID is not displayed. You can display the UUID of the mutation by clicking on the icon with 3
parallel lines and choosing to display the Mutation UUID.
OncoGrid
The Exploration page includes an OncoGrid plot of the cases with the most mutations, for the top 50 mutated genes affected by
high impact mutations. Genes displayed on the left of the grid (Y-axis) correspond to individual cases on the bottom of the grid
(X-axis).


The grid is color-coded with a legend at the top left which describes what type of mutation consequence is observed for each
gene/case combination. Clinical information and the available data for each case are available at the bottom of the grid.
The right side of the grid displays additional information about the genes:
• Gene Sets: Describes whether a gene is part of The Cancer Gene Census. (The Cancer Gene Census is an ongoing effort
to catalogue those genes for which mutations have been causally implicated in cancer)
• GDC: Identifies all cases in the GDC affected with a mutation in this gene


OncoGrid Options
To facilitate readability and comparisons, drag-and-drop can be used to reorder the gene rows. Double clicking a row in the
"Cases Affected" bar at the right side of the graphic launches the respective Gene Summary Page. Hovering over a cell will
display information about the mutation such as its ID, affected case, and biological consequence. Clicking on the cell will bring
the user to the respective Mutation Summary page.
A tool bar at the top right of the graphic allows the user to export the data as a JSON object, PNG image, or SVG image. Seven
buttons are available in this toolbar:
• Download: Users can choose to export the contents either to a static image file (PNG or SVG format) or the underlying
data in JSON format
• Reload Grid: Sets all OncoGrid rows, columns, and zoom levels back to their initial positions
• Cluster Data: Clusters the rows and columns to place mutated genes with the same cases and cases with the same
mutated genes together
• Toggle Heatmap: The view can be toggled between cells representing mutation consequences or number of mutations in
each gene
• Toggle Gridlines: Turn the gridlines on and off
• Toggle Crosshairs: Turns crosshairs on, so that users can zoom into specific sections of the OncoGrid
• Fullscreen: Turns Fullscreen mode on/off

File Navigation
After utilizing the Exploration Page to narrow down a specific cohort, users can find the specific files that relate to this group by
clicking on the View Files in Repository button as shown in the image below.

Clicking this button will navigate the users to the Repository Page, filtered by the cases within the cohort.

The filters chosen on the Exploration page are displayed as an input set on the Repository page. Additional filters may be
added on top of this input set, but the original set cannot be modified and instead must be created from scratch again.


Chapter 4

Repository
Summary
The Repository Page is the primary method of accessing data in the GDC Data Portal. It provides an overview of all cases
and files available in the GDC and offers users a variety of filters for identifying and browsing cases and files of interest.
Users can access the Repository Page from the GDC Data Portal front page, from the Data Portal toolbar, or directly at
https://portal.gdc.cancer.gov/repository.

Filters / Facets
On the left, a panel of data facets allows users to filter cases and files using a variety of criteria. If facet filters are applied, the
tabs on the right will display information about matching cases and files. If no filters are applied, the tabs on the right will
display information about all available data.
On the right, two tabs contain information about available data:
• Files tab provides a list of files, select information about each file, and links to individual file detail pages.
• Cases tab provides a list of cases, select information about each case, and links to individual case summary pages
The banner above the tabs on the right displays any active facet filters and provides access to advanced search.
The top of the Repository Page contains a few summary pie charts for Primary Sites, Projects, Disease Type, Gender, and Vital
Status. These reflect all available data or, if facet filters are applied, only the data that matches the filters. Clicking on a specific
slice in a pie chart, or on a number in a table, applies corresponding facet filters.


Facets Panel
Facets represent properties of the data that can be used for filtering. The facets panel on the left allows users to filter the cases
and files presented in the tabs on the right.
The facets panel is divided into two tabs: the Files tab contains facets pertaining to data files and experimental strategies,
while the Cases tab contains facets pertaining to the cases and biospecimen information. Users can apply filters in both tabs
simultaneously. The applied filters will be displayed in the banner above the tabs on the right, with the option to open the filter
in Advanced Search to further refine the query.
The Getting Started section provides instructions on using facet filters. In the following example, a filter from the Cases tab
(“primary site”) and filters from the Files tab (“data category”, “experimental strategy”) are both applied:


The default set of facets is listed below.
Files facets tab:
• File: Specify individual files using filename or UUID.
• Data Category: A high-level data file category, such as “Raw Sequencing Data” or “Transcriptome Profiling”.
• Data Type: Data file type, such as “Aligned Reads” or “Gene Expression Quantification”. Data Type is more granular
than Data Category.
• Experimental Strategy: Experimental strategies used for molecular characterization of the cancer.
• Workflow Type: Bioinformatics workflow used to generate or harmonize the data file.
• Data Format: Format of the data file.
• Platform: Technological platform on which experimental data was produced.
• Access Level: Indicator of whether access to the data file is open or controlled.
Cases facets tab:
• Case: Specify individual cases using submitter ID (barcode) or UUID.
• Case Submitter ID Prefix: Search for cases using a part (prefix) of the submitter ID (barcode).
• Primary Site: Anatomical site of the cancer under investigation or review.
• Cancer Program: A cancer research program, typically consisting of multiple focused projects.
• Project: A cancer research project, typically part of a larger cancer research program.
• Disease Type: Type of cancer studied.
• Gender: Gender of the patient.
• Age at Diagnosis: Patient age at the time of diagnosis.
• Vital Status: Indicator of whether the patient was living or deceased at the date of last contact.
• Days to Death: Number of days from date of diagnosis to death of the patient.
• Race: Race of the patient.
• Ethnicity: Ethnicity of the patient.


Adding Custom Facets
The Repository Page provides access to additional data facets beyond those listed above. Facets corresponding to additional
properties listed in the GDC Data Dictionary can be added using the “add a filter” links available at the top of the Cases and
Files facet tabs:

The links open a search window that allows the user to find an additional facet by name or description. Not all facets have values
available for filtering; checking the “Only show fields with values” checkbox will limit the search results to only those that do.
Selecting a facet from the list of search results below the search box will add it to the facets panel.

Newly added facets will show up at the top of the facets panel and can be removed individually by clicking on the red cross to
the right of the facet name. The default set of facets can be restored by clicking “Reset”.


Results

Files List
The Files tab on the right provides a list of available files and select information about each file. If facet filters are applied, the
list includes only matching files. Otherwise, the list includes all data files available in the GDC Data Portal.


The File Name column includes links to file detail pages where the user can learn more about each file.
Users can add individual file(s) to the file cart using the cart button next to each file. Alternatively, all files that match the
current facet filters can be added to the cart using the menu in the top left corner of the table:


Cases List
The Cases tab on the right provides a list of available cases and select information about each case. If facet filters are applied, the
list includes only matching cases. Otherwise, the list includes all cases available in the GDC Data Portal.


The list includes links to case summary pages in the Case UUID column, the Submitter ID (e.g. TCGA Barcode), and counts of
the available file types for each case. Clicking on a count will apply facet filters to display the corresponding files.
The list also includes a shopping cart button, allowing the user to add all files associated with a case to the file cart for downloading
at a later time:

Navigation
After utilizing the Repository Page to narrow down a specific set of cases, users can continue to explore the mutations and genes
affected by these cases by clicking the View Files in Repository button as shown in the image below.

Clicking this button will navigate the users to the Exploration Page, filtered by the cases within the cohort.

Case Summary Page
The Case Summary page displays case details including the project and disease information, data files that are available for that
case, and the experimental strategies employed. A button in the top-right corner of the page allows the user to add all files
associated with the case to the file cart.

Clinical and Biospecimen Information
The page also provides clinical and biospecimen information about that case. Links to export clinical and biospecimen information
in JSON format are provided.

For clinical record types that can hold multiple entries (Diagnoses, Family Histories, or Exposures), the UUID of each
record is shown on the left-hand side of the corresponding tab, allowing the user to select the entry of interest.

Biospecimen Search
A search filter just below the biospecimen section can be used to find and filter biospecimen data. The wildcard search
highlights entities in the tree that match the characters typed. It searches both the case submitter ID and the additional
metadata for each entity. For example, searching ‘Primary Tumor’ will highlight samples that match that type.

Most Frequent Somatic Mutations
The case entity page also lists the mutations found in that particular case.

The table lists the following information for each mutation:
• DNA Change: The chromosome and starting coordinates of the mutation are displayed along with the nucleotide
differences between the reference and tumor allele
• Type: A general classification of the mutation
• Consequences: The effects the mutation has on the gene coding for a protein (e.g. synonymous, missense, non-coding
transcript)
• # Affected Cases in Project: The number of affected cases, expressed as number across all mutations within the Project
• # Affected Cases Across GDC: The number of affected cases, expressed as number across all projects. Choosing the
arrow next to the percentage will expand the selection with a breakdown of each affected project
• Impact (VEP): A subjective classification of the severity of the variant consequence. This information comes from the
Ensembl VEP. The categories are:
  • HIGH (H): The variant is assumed to have high (disruptive) impact in the protein, probably causing protein truncation,
  loss of function or triggering nonsense mediated decay
  • MODERATE (M): A non-disruptive variant that might change protein effectiveness
  • LOW (L): Assumed to be mostly harmless or unlikely to change protein behavior
  • MODIFIER (MO): Usually non-coding variants or variants affecting non-coding genes, where predictions are difficult or
  there is no evidence of impact
Clicking on the Open in Exploration button at the top right of this section will navigate the user to the Exploration page,
filtered on this case.

File Summary Page
The File Summary page provides information about a data file, including file properties like size, md5 checksum, and data format;
information on the type of data included; links to the associated case and biospecimen; and information about how the data file
was generated or processed.
The page also includes buttons to download the file, add it to the file cart, or (for BAM files) utilize the BAM slicing function.

In the lower section of the screen, the following tables provide more details about the file and its characteristics:
• Associated Cases / Biospecimen: List of Cases or biospecimen the file is directly attached to.
• Analysis and Reference Genome: Information on the workflow and reference genome used for file generation.
• Read Groups: Information on the read groups associated with the file.
• Metadata Files: Experiment metadata, run metadata and analysis metadata associated with the file.
• Downstream Analysis Files: List of downstream analysis files generated by the file.

Note: The Legacy Archive will not display the “Workflow, Reference Genome and Read Groups” sections (these sections are applicable
to the GDC harmonization pipeline only). However, it may provide information on Archives and metadata files like MAGE-TABs
and SRA XMLs. For more information, please refer to the section Legacy Archive.

BAM Slicing
BAM file detail pages have a “BAM Slicing” button. This function allows the user to specify a region of a BAM file for download.
Clicking on it will open the BAM slicing window:

During preparation of the slice, the icon on the BAM Slicing button will be spinning, and the file will be offered for download to
the user as soon as ready.

Chapter 5

Genes and Mutations
Gene and Mutation Summary Pages
Many parts of the GDC website contain links to Gene and Mutation summary pages. These pages display information about
specific genes and mutations, along with visualizations and data showing how they relate to the projects
and cases within the GDC. The gene and mutation data visualized on these pages is produced from the Open-Access
MAF files available for download on the GDC Portal.

Gene Summary Page
Gene Summary Pages describe each gene with mutation data and provide results related to the analyses that are performed on
these genes.

Summary
The summary section of the gene page contains the following information:

• Symbol: The gene symbol
• Name: Full name of the gene
• Synonyms: Synonyms of the gene name or symbol, if available
• Type: A broad classification of the gene
• Location: The chromosome on which the gene is located and its coordinates
• Strand: If the gene is located on the forward (+) or reverse (-) strand
• Description: A description of gene function and downstream consequences of gene alteration
• Annotation: A notation/link that states whether the gene is part of The Cancer Gene Census

External References
A list with links that lead to external databases with additional information about each gene is displayed here. These external
databases include: Entrez, Uniprot, Hugo Gene Nomenclature Committee, Online Mendelian Inheritance in Man, and Ensembl.

Cancer Distribution
A table and bar graph show how many cases are affected by mutations within the gene as a ratio and percentage. Each row/bar
represents the number of cases for each project. The final column in the table lists the number of unique mutations observed on
the gene for each project.

Protein Viewer

Mutations and their frequency across cases are mapped to a graphical visualization of protein-coding regions with a lollipop
plot. Pfam domains are highlighted along the x-axis to assign functionality to specific protein-coding regions. The bottom track
represents a view of the full gene length. Different transcripts can be selected by using the drop-down menu above the plot.
The panel to the right of the plot allows the plot to be filtered by mutation consequences or impact. The plot will dynamically
change as filters are applied. Mutation consequence and impact are denoted in the plot by color.
Note: The impact filter on this panel will not display the annotations for alternate transcripts.
The plot can be viewed at different zoom levels by clicking and dragging across the x-axis, clicking and dragging across the
bottom track, or double-clicking the Pfam domain IDs. The Reset button can be used to bring the zoom level back to its original
position. The plot can also be exported as a PNG image, SVG image, or JSON-formatted text by choosing the Download
button above the plot.

Most Frequent Mutations
The 20 most frequent mutations in the gene are displayed as a bar graph that indicates the number of cases that share each
mutation.

A table is displayed below that lists information about each mutation, including:
• DNA Change: The chromosome and starting coordinates of the mutation are displayed along with the nucleotide
differences between the reference and tumor allele
• Type: A general classification of the mutation
• Consequences: The effects the mutation has on the gene coding for a protein (e.g. synonymous, missense, non-coding
transcript)
• # Affected Cases in Gene: The number of affected cases, expressed as number across all mutations within the Gene
• # Affected Cases Across GDC: The number of affected cases, expressed as number across all projects. Choosing the
arrow next to the percentage will expand the selection with a breakdown of each affected project
• Impact: A subjective classification of the severity of the variant consequence. This is determined using Ensembl VEP,
PolyPhen, and SIFT. The categories are outlined here.
Note: The Mutation UUID can be displayed in this table by selecting it from the drop-down represented by three parallel lines.
Clicking the Open in Exploration button will navigate the user to the Exploration page, showing the same results as in the table
(mutations filtered by the gene).

Mutation Summary Page
The Mutation Summary Page contains information about one somatic mutation and how it affects the associated gene. Each
mutation is identified by its chromosomal position and nucleotide-level change.

Summary

• ID: A unique identifier (UUID) for this mutation
• DNA Change: Denotes the chromosome number, position, and nucleotide change of the mutation
• Type: A broad categorization of the mutation
• Reference Genome Assembly: The reference genome assembly to which the chromosomal position refers
• Allele in the Reference Assembly: The nucleotide(s) that compose the site in the reference assembly
• Functional Impact: A subjective classification of the severity of the variant consequence.

External References
A separate panel contains links to databases that contain information about the specific mutation. These include dbSNP and
COSMIC.

Consequences
The consequences of the mutation are displayed in a table. The consequence terms are defined by the Sequence Ontology.

The fields that describe each consequence are listed below:
• Gene: The symbol for the affected gene
• AA Change: Details on the amino acid change, including compounds and position, if applicable
• Consequence: The biological consequence of each mutation
• Coding DNA Change: The specific nucleotide change and position of the mutation within the gene
• Strand: If the gene is located on the forward (+) or reverse (-) strand
• Transcript(s): The transcript(s) affected by the mutation. Each contains a link to the Ensembl entry for the transcript

Cancer Distribution
A table and bar graph show how many cases are affected by the particular mutation. Each row/bar represents the number of
cases for each project.

The table contains the following fields:
• Project ID: The ID for a specific project
• Disease Type: The disease associated with the project
• Site: The anatomical site affected by the disease
• # Affected Cases: The number of affected cases and total number of cases displayed as a fraction and percentage

Protein Viewer

The protein viewer displays a plot representing the position of mutations along the polypeptide chain. The y-axis represents the
number of cases that exhibit each mutation, whereas the x-axis represents the polypeptide chain sequence. Pfam domains
identified along the polypeptide chain are marked with colored rectangles labeled with Pfam IDs. See the Gene Summary
Page for additional details about the protein viewer.
The panel to the right of the plot allows the plot to be filtered by mutation consequences or impact. The plot will dynamically
change as filters are applied. Mutation consequence and impact are denoted in the plot by color.
Note: The impact filter on this panel will not display the annotations for alternate transcripts.
The plot can be viewed at different zoom levels by clicking and dragging across the x-axis, clicking and dragging across the
bottom track, or double-clicking the Pfam domain IDs. The Reset button can be used to bring the zoom level back to its original
position. The plot can also be exported as a PNG image, SVG image, or JSON-formatted text by choosing the Download
button above the plot.

Chapter 6

Custom Set Analysis
Custom Set Analysis
In addition to the Exploration page, the GDC Data Portal also has features used to save and compare sets of cases, genes, and
mutations. These sets can either be generated with existing filters (e.g. males with lung cancer) or through custom selection
(e.g. a user-generated list of case IDs).
Note that saving a set only saves the type of entity included in the set. For example, a saved case set will not include filters that
were applied to genes or mutations. Please be aware that your custom sets are deleted during each new GDC data release. You
can export them and re-upload them in the “Manage Sets” link at the top right of the Portal.

Generating a Cohort for Analysis
Cohort sets are completely customizable and can be generated for cases, genes, or mutations using the following methods:
Upload ID Set: This feature is available in the “Manage Sets” link at the top right of the Portal. Choose “Upload Set” and
then select whether the set comprises cases, genes, or mutations. A set of IDs (IDs* or UUIDs) can then be uploaded in a text file
or copied and pasted into the list of identifiers field along with a name identifying the set. Once the list of identifiers is uploaded,
the identifiers are validated and grouped according to whether each one matched an existing GDC ID or did not match (“Unmatched”).

* This is referred to as a submitter_id in the GDC API, which is a non-UUID identifier such as a TCGA barcode.
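
The validation step can be pictured as a simple partition of the uploaded identifiers. The sketch below is only an illustration: the `known_ids` set, the sample barcodes, and the `group_identifiers` function are hypothetical, and the Portal's actual matching logic is not described in this guide.

```python
def group_identifiers(uploaded_text, known_ids):
    """Split pasted identifiers into matched and unmatched lists.

    `known_ids` is a hypothetical set standing in for the identifiers
    (submitter IDs or UUIDs) that exist in the GDC.
    """
    # Accept identifiers separated by newlines, commas, or other whitespace.
    tokens = uploaded_text.replace(",", " ").split()
    matched, unmatched, seen = [], [], set()
    for token in tokens:
        if token in seen:  # ignore repeated identifiers
            continue
        seen.add(token)
        (matched if token in known_ids else unmatched).append(token)
    return matched, unmatched


known_ids = {"TCGA-A1-0001", "TCGA-A1-0002"}  # made-up barcodes
matched, unmatched = group_identifiers(
    "TCGA-A1-0001\nTCGA-ZZ-9999, TCGA-A1-0002", known_ids
)
print("Matched:", matched)
print("Unmatched:", unmatched)
```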
Apply Filters in Exploration: Sets can be assembled using the existing filters in the Exploration page. They can be saved by
choosing the “Save/Edit Case Set” button under the pie charts for case sets. This will prompt a decision to:
• Save as new case set
• Add to existing case set
• Remove from existing case set
Similarly, gene and mutation filters can be applied and saved in the Exploration page in the Genes and Mutations tabs, respectively.

Analysis Page
Clicking on the Analysis button in the top toolbar will launch the Analysis Page which displays the various options available for
comparing saved sets.

There are two tabs on this page:
• Launch Analysis: Where users can select either to do Set Operations or Cohort Comparison
• Results: Where users can view the results of current or previous set analyses

Analysis Page: Set Operations
Up to three sets of the same set type can be compared and exported based on complex overlapping subsets. The features of this
page include:

• Venn Diagram: Visually displays the overlapping items included within the three sets. Subsets based on overlap can be
selected by clicking one or many sections of the Venn diagram. As sections of the Venn Diagram become highlighted in
blue, their corresponding row in the overlap table becomes highlighted.
• Summary Table: Displays the alias, item type, and name for each set included in this analysis
• Overlap Table: Displays the number of overlapping items with set operations rather than a visual diagram. Subsets can
be selected by checking boxes in the “Select” column, which will highlight the corresponding section of the Venn Diagram.
As rows are selected, the “Union of selected sets” row is populated. Each row has an option to save the subset as a new set,
export the set as a TSV, or view files in the repository. The links that correspond to the number of items in each row will
open the cohort in the Exploration page.
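
The regions of a three-set Venn diagram correspond to ordinary set operations. The following sketch uses Python sets with made-up case IDs. The region labels are illustrative and are not necessarily the labels shown in the Overlap Table.

```python
# Three made-up case sets; the IDs are placeholders, not real GDC cases.
s1 = {"case-a", "case-b", "case-c"}
s2 = {"case-b", "case-c", "case-d"}
s3 = {"case-c", "case-e"}

# Each region of the Venn diagram expressed as a set operation.
overlap = {
    "S1 & S2 & S3": s1 & s2 & s3,
    "(S1 & S2) - S3": (s1 & s2) - s3,
    "(S1 & S3) - S2": (s1 & s3) - s2,
    "(S2 & S3) - S1": (s2 & s3) - s1,
    "S1 - (S2 | S3)": s1 - (s2 | s3),
    "S2 - (S1 | S3)": s2 - (s1 | s3),
    "S3 - (S1 | S2)": s3 - (s1 | s2),
}

# Selecting several rows corresponds to taking the union of those regions.
selected = ["S1 & S2 & S3", "(S1 & S2) - S3"]
union_of_selected = set().union(*(overlap[name] for name in selected))

for name, members in overlap.items():
    print(f"{name}: {len(members)}")
print("Union of selected sets:", sorted(union_of_selected))
```

Because the seven regions are disjoint, their sizes add up to the size of the union of all three sets.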

Analysis Tab: Cohort Comparison
The “Cohort Comparison” analysis displays a series of graphs and tables that demonstrate the similarities and differences between
two case sets. The following features are displayed for the two sets:
• A key detailing the number of cases in each cohort and the color that represents each (blue/gold)
• A Venn diagram, which shows the overlap between the two cohorts. The Venn diagram can be opened in a ‘Set Operations’
tab by choosing “Open venn diagram in new tab”
• A selectable survival plot that compares both sets with information about the percentage of represented cases
• A breakdown of each cohort by selectable clinical facets with a bar graph and table. Facets include vital_status, gender,
race, ethnicity, and age_at_diagnosis. A p-value (if it can be calculated from the data) that demonstrates whether the
statuses are proportionally represented is displayed for the vital_status, gender, and ethnicity facets.

Analysis Page: Results
The results of the previous analyses are displayed on this page.
Each tab at the left side of the page is labeled according to the analysis type and the date that the analysis was performed and
can be reviewed as long as it is present. The “Delete All” button will remove all of the previous analyses.

Chapter 7

Annotations
Annotations
Annotations are notes added to individual cases, samples or files.

Annotations View
The Annotations View provides an overview of the available annotations and allows users to browse and filter the annotations
based on a number of annotation properties (facets), such as the type of entity the annotation is attached to or the annotation
category.
The view presents a list of annotations in tabular format on the right, and a facet panel on the left that allows users to filter the
annotations displayed in the table. If facet filters are applied, the table on the right will display only the matching annotations. If
no filters are applied, the table will display all available annotations.
Clicking on an annotation ID in the annotations list will take the user to the Annotation Detail Page.

Facets Panel
The following facets are available to search for annotations:
• Annotation ID: Search using annotation ID
• Entity ID: Search using entity ID
• Case UUID: Search using case UUID
• Primary Site: Anatomical site of the cancer
• Project: A cancer research project, typically part of a larger cancer research program
• Entity Type: The type of entity the annotation is associated with: Patient, Sample, Portion, Slide, Analyte, Aliquot
• Annotation Category: Search by annotation category.
• Annotation Created: Search for annotations by date of creation.
• Annotation Classification: Search by annotation classification.

Annotation Categories and Classification
For more details about categories and classifications please refer to the TCGA Annotations page on NCI Wiki.

Annotation Detail Page
The annotation entity page provides more details about a specific annotation. It is available by clicking on an annotation ID in
Annotations View.

Chapter 8

Advanced Search
Advanced Search
Only available in the Repository view, the Advanced Search page offers complex query building capabilities to identify specific sets
of cases and files.

Overview: GQL
Advanced Search lets users write structured queries in Genomic Query Language (GQL) to search for files and cases.

A simple query in GQL (also known as a ‘clause’) consists of a field, followed by an operator, followed by one or more values.
For example, the simple query cases.primary_site = Brain will find all cases for projects in which the primary site is Brain:

Note that it is not possible to compare two fields (e.g. disease_type = project.name).
Note: GQL is not a database query language. For example, GQL does not have a “SELECT” statement.
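
To make the field / operator / value structure concrete, here is a minimal sketch that splits a simple comparison clause into its three parts. It is not the Portal's parser: the regular expression and the `split_clause` helper are assumptions for illustration, and they only cover the comparison operators (=, !=, <, <=, >, >=).

```python
import re

# Comparison operators, with two-character operators listed first so that
# ">=" is not read as ">" followed by "=".
CLAUSE = re.compile(
    r"^\s*(?P<field>[\w.]+)\s*(?P<op>!=|<=|>=|=|<|>)\s*(?P<value>.+?)\s*$"
)


def split_clause(clause):
    """Return (field, operator, value) for a simple comparison clause."""
    match = CLAUSE.match(clause)
    if match is None:
        raise ValueError(f"not a simple clause: {clause!r}")
    value = match.group("value")
    # Drop the surrounding double quotes used for multi-word values.
    if len(value) >= 2 and value[0] == value[-1] == '"':
        value = value[1:-1]
    return match.group("field"), match.group("op"), value


print(split_clause("cases.primary_site = Brain"))
# ('cases.primary_site', '=', 'Brain')
```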

Switching between Advanced Search and Facet Filters
When accessing Advanced Search from Repository View, a query created using facet filters in Repository View will be automatically
translated to an Advanced Search GQL Query.
A query created in Advanced Search is not translated back to facet filters. Clicking on “Back to Facet Search” will return the
user to Repository View and reset the filters.

Using the Advanced Search
When opening the Advanced Search page (via the Repository view), the search field will be automatically populated with any facet
filters already applied.
This default query can be removed by pressing “Reset”.
Once the query has been entered and is identified as a “Valid Query”, click on “Search” to run your query.

Auto-complete
As a query is being written, the GDC Data Portal will analyze the context and offer a list of auto-complete suggestions.
Auto-complete suggests both fields and values as described below.
Field Auto-complete
The list of auto-complete suggestions includes all available fields matching the user text input. The user has to scroll down to see
more fields in the dropdown:

Value Auto-complete
The list of auto-complete suggestions includes the top 100 values that match the user text input. The user has to scroll down to see
more values in the dropdown.
The value auto-complete is not aware of the general context of the query: the system will display all values available in the GDC for
the selected field. As a result, the query could return 0 results depending on the other filters.

Note: Quotes are automatically added to the value if it contains spaces.

Setting Precedence of Operators
You can use parentheses in complex GQL statements to enforce the precedence of operators.
For example, if you want to find all the open files in the TCGA program as well as all the files in the TARGET program, you can use
parentheses to enforce the precedence of the boolean operators in your query:
(files.access = open and cases.project.program.name = TCGA) or cases.project.program.name = TARGET
Note: Without parentheses, the statement will be evaluated left-to-right.
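
The effect of grouping can be checked on a small example. The sketch below applies the same three conditions to four made-up file records, first grouped as in the query above and then grouped differently. It uses plain Python booleans to illustrate the logic and does not call the Portal.

```python
# Made-up file records, one for each combination of access and program.
files = [
    {"access": "open", "program": "TCGA"},
    {"access": "controlled", "program": "TCGA"},
    {"access": "open", "program": "TARGET"},
    {"access": "controlled", "program": "TARGET"},
]


def is_open(f):
    return f["access"] == "open"


def is_tcga(f):
    return f["program"] == "TCGA"


def is_target(f):
    return f["program"] == "TARGET"


# (open and TCGA) or TARGET: open TCGA files plus every TARGET file.
grouped_left = [f for f in files if (is_open(f) and is_tcga(f)) or is_target(f)]

# open and (TCGA or TARGET): only open files, from either program.
grouped_right = [f for f in files if is_open(f) and (is_tcga(f) or is_target(f))]

print(len(grouped_left), len(grouped_right))  # 3 2
```

The first grouping keeps the controlled-access TARGET file and the second does not, which is why the parentheses matter.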

Keywords
A GQL keyword is a word that joins two or more clauses together to form a complex GQL query.
List of Keywords:
• AND
• OR
Note: parentheses can be used to control the order in which clauses are executed.

AND Keyword
Used to combine multiple clauses, allowing you to refine your search.
Examples:
• Find all open files in breast cancer
cases.project.primary_site = Breast and files.access = open
• Find all open files in breast cancer and data type is copy number variation
cases.project.primary_site = Breast and files.access = open and files.data_type = "Copy number variation"

OR Keyword
Used to combine multiple clauses, allowing you to expand your search.
Note: The IN operator can be an alternative to OR and result in simplified queries.
Examples:
• Find all files that are raw sequencing data or raw microarray data:
files.data_type = "Raw microarray data" or files.data_type = "Raw sequencing data"
• Find all files where donors are male or vital status is alive:
cases.demographic.gender = male or cases.diagnoses.vital_status = alive

Operators
An operator in GQL is one or more symbols or words comparing the value of a field on its left with one or more values on its
right, such that only true results are retrieved by the clause.

List of Operators and Query format
| Operator | Description |
|---|---|
| = | Field EQUAL Value (String or Number) |
| != | Field NOT EQUAL Value (String or Number) |
| < | Field LOWER THAN Value (Number or Date) |
| <= | Field LOWER THAN OR EQUAL Value (Number or Date) |
| > | Field GREATER THAN Value (Number or Date) |
| >= | Field GREATER THAN OR EQUAL Value (Number or Date) |
| IN | Field IN [Value 1, Value 2] |
| EXCLUDE | Field EXCLUDE [Value 1, Value 2] |
| IS MISSING | Field IS MISSING |
| NOT MISSING | Field NOT MISSING |

“=” operator - EQUAL
The “=” operator is used to search for files where the value of the specified field exactly matches the specified value.
Examples:
• Find all files that are gene expression:
files.data_type = "Gene expression"
• Find all cases whose gender is female:
cases.demographic.gender = female

“!=” operator - NOT EQUAL
The “!=” operator is used to search for files where the value of the specified field does not match the specified value.
The “!=” operator will not match a field that has no value (i.e. a field that is empty). For example, ‘gender != male’ will only
match cases that have a gender and whose gender is not male. To find cases other than male or with no gender populated, you
would need to type gender != male or gender is missing.
Example:
• Find all files with an experimental strategy other than genotyping array:
files.experimental_strategy != "Genotyping array"
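
The handling of empty fields is easy to overlook, so here is a small sketch of the behavior described above. The records and the `not_equal` / `is_missing` helpers are hypothetical; they model the documented semantics rather than the Portal's implementation.

```python
# Made-up case records; c3 has no gender recorded.
cases = [
    {"id": "c1", "gender": "male"},
    {"id": "c2", "gender": "female"},
    {"id": "c3", "gender": None},
]


def not_equal(record, field, value):
    """'!=' as described above: an empty field never matches."""
    return record.get(field) is not None and record[field] != value


def is_missing(record, field):
    """'IS MISSING': true only when the field has no value."""
    return record.get(field) is None


# gender != male
only_ne = [c["id"] for c in cases if not_equal(c, "gender", "male")]

# gender != male or gender is missing
ne_or_missing = [
    c["id"]
    for c in cases
    if not_equal(c, "gender", "male") or is_missing(c, "gender")
]

print(only_ne)        # ['c2']
print(ne_or_missing)  # ['c2', 'c3']
```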

“>” operator - GREATER THAN
The “>” operator is used to search for files where the value of the specified field is greater than the specified value.
Example:
• Find all cases whose number of days to death is greater than 60:
cases.diagnoses.days_to_death > 60

“>=” operator - GREATER THAN OR EQUALS
The “>=” operator is used to search for files where the value of the specified field is greater than or equal to the specified value.
Example:
• Find all cases whose number of days to death is equal or greater than 60:
cases.diagnoses.days_to_death >= 60

“<” operator - LESS THAN
The “<” operator is used to search for files where the value of the specified field is less than the specified value.
Example:
• Find all cases whose age at diagnosis is less than 400 days:
cases.diagnoses.age_at_diagnosis < 400

“<=” operator - LESS THAN OR EQUALS
The “<=” operator is used to search for files where the value of the specified field is less than or equal to the specified value.
Example:
• Find all cases with a number of days to death less than or equal to 20:
cases.diagnoses.days_to_death <= 20

“IN” Operator
The “IN” operator is used to search for files where the value of the specified field is one of multiple specified values. The values
are specified as a comma-delimited list, surrounded by brackets [ ].
Using “IN” is equivalent to using multiple ‘EQUALS (=)’ statements, but is shorter and more convenient. That is, typing
‘project IN [ProjectA, ProjectB, ProjectC]’ is the same as typing ‘project = "ProjectA" OR project = "ProjectB" OR project =
"ProjectC"’.
Examples:
• Find all files in brain, breast, and lung cancer:
cases.project.primary_site IN [Brain, Breast, Lung]
• Find all files whose data type is aligned reads or unaligned reads:
files.data_type IN ["Aligned reads", "Unaligned reads"]
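
The equivalence between “IN” and a chain of “=” clauses can be written out mechanically. The `expand_in` helper below is a hypothetical illustration that produces the longer form as text. It assumes single-word values. Values containing spaces would also need quotes.

```python
def expand_in(field, values):
    """Rewrite `field IN [v1, v2, ...]` as a chain of '=' clauses joined by OR."""
    return " or ".join(f"{field} = {value}" for value in values)


print(expand_in("cases.project.primary_site", ["Brain", "Breast", "Lung"]))
# cases.project.primary_site = Brain or cases.project.primary_site = Breast or cases.project.primary_site = Lung
```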

“EXCLUDE” Operator
The “EXCLUDE” operator is used to search for files where the value of the specified field is not one of multiple specified values.
Using “EXCLUDE” is equivalent to using multiple ‘NOT_EQUALS (!=)’ statements, but is shorter and more convenient. That is,
typing ‘project EXCLUDE [ProjectA, ProjectB, ProjectC]’ is the same as typing ‘project != "ProjectA" AND project != "ProjectB"
AND project != "ProjectC"’.
The “EXCLUDE” operator will not match a field that has no value (i.e. a field that is empty). For example, ‘experimental_strategy
EXCLUDE ["WGS", "WXS"]’ will only match files that have an experimental strategy and whose experimental strategy is
not “WGS” or “WXS”. To find files whose experimental strategy is different from “WGS” or “WXS” or is not assigned,
you would need to type: files.experimental_strategy exclude ["WXS", "WGS"] or files.experimental_strategy is missing.
Examples:
• Find all files where experimental strategy is not WXS, WGS, or Genotyping array:
files.experimental_strategy EXCLUDE [WXS, WGS, "Genotyping array"]
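
A value is excluded only if it differs from every listed value, so a chain of “!=” clauses behaves like “EXCLUDE” only when the clauses are joined with AND. The sketch below checks this on a few sample values, including an empty field. The helper functions and sample values are assumptions for illustration.

```python
def exclude(value, excluded):
    """EXCLUDE as described above: an empty field never matches."""
    return value is not None and value not in excluded


def not_equal_chain(value, excluded):
    """The same test written as '!=' clauses joined with AND."""
    return all(value is not None and value != e for e in excluded)


strategies = ["WGS", "WXS", "RNA-Seq", None]  # None stands for an empty field
excluded = ["WGS", "WXS"]

# Both formulations agree on every sample value.
for s in strategies:
    assert exclude(s, excluded) == not_equal_chain(s, excluded)

# experimental_strategy exclude ["WGS", "WXS"]
kept = [s for s in strategies if exclude(s, excluded)]

# ... or experimental_strategy is missing
kept_or_missing = [s for s in strategies if exclude(s, excluded) or s is None]

print(kept)             # ['RNA-Seq']
print(kept_or_missing)  # ['RNA-Seq', None]
```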

“IS MISSING” Operator
The “IS” operator can only be used with “MISSING”. That is, it is used to search for files where the specified field has no value.
Examples:
• Find all cases where gender is missing:
cases.demographic.gender is MISSING

“NOT MISSING” Operator
The “NOT” operator can only be used with “MISSING”. That is, it is used to search for files where the specified field has a value.
Examples:
• Find all cases where race is not missing:
cases.demographic.race NOT MISSING

Special Cases
Date format
The date format should be the following: YYYY-MM-DD (without quotes).
Example:
files.updated_datetime > 2015-12-31
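
When queries are assembled in a script, the date can be formatted with the standard library. This is a minimal sketch. The `date_clause` helper is a hypothetical name, and the resulting text would still be pasted into the Advanced Search field by the user.

```python
from datetime import date


def date_clause(field, operator, day):
    """Build a clause with the date written as YYYY-MM-DD, without quotes."""
    return f"{field} {operator} {day.strftime('%Y-%m-%d')}"


print(date_clause("files.updated_datetime", ">", date(2015, 12, 31)))
# files.updated_datetime > 2015-12-31
```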

Using Quotes
A value must be quoted if it contains a space. Otherwise the advanced search will not be able to interpret the value.
Quotes are not necessary if the value consists of one single word.
• Example: Find all cases where the primary site is brain and the data type is copy number variation:
cases.project.primary_site = Brain and files.data_type = "Copy number variation"
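
The quoting rule can be applied automatically when query text is generated by a script. In the sketch below, `quote_value` and `clause` are hypothetical helpers that add double quotes only when a value contains a space.

```python
def quote_value(value):
    """Wrap a value in double quotes only when it contains a space."""
    return f'"{value}"' if " " in value else value


def clause(field, operator, value):
    """Format one clause, quoting the value if needed."""
    return f"{field} {operator} {quote_value(value)}"


query = " and ".join(
    [
        clause("cases.project.primary_site", "=", "Brain"),
        clause("files.data_type", "=", "Copy number variation"),
    ]
)
print(query)
# cases.project.primary_site = Brain and files.data_type = "Copy number variation"
```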

Age at Diagnosis - Unit in Days
Age at diagnosis is expressed in days, so the user has to convert a number of years to a number of days.
The conversion factor is 1 year = 365.25 days.
• Example: Find all cases whose age at diagnosis is greater than 40 years (40 * 365.25 = 14610 days):
cases.diagnoses.age_at_diagnosis > 14610
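
The conversion is a single multiplication. The sketch below applies the 365.25 factor and builds the corresponding clause. The `years_to_days` name is hypothetical, and rounding to whole days is an assumption, since the guide does not say how fractional days should be handled.

```python
DAYS_PER_YEAR = 365.25  # conversion factor given in this guide


def years_to_days(years):
    """Convert an age in years to days, rounded to a whole number of days."""
    return round(years * DAYS_PER_YEAR)


print(f"cases.diagnoses.age_at_diagnosis > {years_to_days(40)}")
# cases.diagnoses.age_at_diagnosis > 14610
```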

Fields Reference
The full list of fields available on the GDC Data Portal can be found through the GDC API using the following endpoint:
https://api.gdc.cancer.gov/gql/_mapping
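
For users who prefer to retrieve the list from a script, the endpoint can be requested with Python's standard library. This sketch assumes that the endpoint returns a JSON document. The structure of that document is not described in this guide, so the example only downloads and parses it. The `fetch_mapping` name and the timeout value are illustrative.

```python
import json
import urllib.request

MAPPING_URL = "https://api.gdc.cancer.gov/gql/_mapping"


def fetch_mapping(url=MAPPING_URL, timeout=30):
    """Download the field mapping and return it as parsed JSON.

    Assumes the endpoint responds with a JSON body.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example use (requires network access):
# mapping = fetch_mapping()
# print(type(mapping))
```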
Alternatively, a static list of fields is available below (not exhaustive).

Files
• files.access
• files.acl
• files.archive.archive_id
• files.archive.revision
• files.archive.submitter_id
• files.center.center_id
• files.center.center_type
• files.center.code
• files.center.name
• files.center.namespace
• files.center.short_name
• files.data_format
• files.data_subtype
• files.data_type
• files.experimental_strategy
• files.file_id
• files.file_name
• files.file_size
• files.md5sum
• files.origin
• files.platform
• files.related_files.file_id
• files.related_files.file_name
• files.related_files.md5sum
• files.related_files.type
• files.state
• files.state_comment
• files.submitter_id
• files.tags

Cases
• cases.case_id
• cases.submitter_id
• cases.diagnoses.age_at_diagnosis
• cases.diagnoses.days_to_death
• cases.demographic.ethnicity
• cases.demographic.gender
• cases.demographic.race
• cases.diagnoses.vital_status
• cases.project.disease_type
• cases.project.name
• cases.project.program.name
• cases.project.program.program_id
• cases.project.project_id
• cases.project.state
• cases.samples.sample_id
• cases.samples.submitter_id
• cases.samples.sample_type
• cases.samples.sample_type_id
• cases.samples.shortest_dimension
• cases.samples.time_between_clamping_and_freezing
• cases.samples.time_between_excision_and_freezing
• cases.samples.tumor_code
• cases.samples.tumor_code_id
• cases.samples.current_weight
• cases.samples.days_to_collection
• cases.samples.days_to_sample_procurement
• cases.samples.freezing_method
• cases.samples.initial_weight
• cases.samples.intermediate_dimension
• cases.samples.is_ffpe
• cases.samples.longest_dimension
• cases.samples.oct_embedded
• cases.samples.pathology_report_uuid
• cases.samples.portions.analytes.a260_a280_ratio
• cases.samples.portions.analytes.aliquots.aliquot_id
• cases.samples.portions.analytes.aliquots.amount
• cases.samples.portions.analytes.aliquots.center.center_id
• cases.samples.portions.analytes.aliquots.center.center_type
• cases.samples.portions.analytes.aliquots.center.code
• cases.samples.portions.analytes.aliquots.center.name
• cases.samples.portions.analytes.aliquots.center.namespace
• cases.samples.portions.analytes.aliquots.center.short_name
• cases.samples.portions.analytes.aliquots.concentration
• cases.samples.portions.analytes.aliquots.source_center
• cases.samples.portions.analytes.aliquots.submitter_id
• cases.samples.portions.analytes.amount
• cases.samples.portions.analytes.analyte_id
• cases.samples.portions.analytes.analyte_type
• cases.samples.portions.analytes.concentration
• cases.samples.portions.analytes.spectrophotometer_method
• cases.samples.portions.analytes.submitter_id
• cases.samples.portions.analytes.well_number
• cases.samples.portions.center.center_id
• cases.samples.portions.center.center_type
• cases.samples.portions.center.code
• cases.samples.portions.center.name
• cases.samples.portions.center.namespace
• cases.samples.portions.center.short_name
• cases.samples.portions.is_ffpe
• cases.samples.portions.portion_id
• cases.samples.portions.portion_number
• cases.samples.portions.slides.number_proliferating_cells
• cases.samples.portions.slides.percent_eosinophil_infiltration
• cases.samples.portions.slides.percent_granulocyte_infiltration
• cases.samples.portions.slides.percent_inflam_infiltration
• cases.samples.portions.slides.percent_lymphocyte_infiltration
• cases.samples.portions.slides.percent_monocyte_infiltration
• cases.samples.portions.slides.percent_necrosis
• cases.samples.portions.slides.percent_neutrophil_infiltration
• cases.samples.portions.slides.percent_normal_cells
• cases.samples.portions.slides.percent_stromal_cells
• cases.samples.portions.slides.percent_tumor_cells
• cases.samples.portions.slides.percent_tumor_nuclei
• cases.samples.portions.slides.section_location
• cases.samples.portions.slides.slide_id
• cases.samples.portions.slides.submitter_id
• cases.samples.portions.submitter_id
• cases.samples.portions.weight

Chapter 9

Authentication
Authentication
Overview
The GDC Data Portal provides granular metadata for all datasets available in the GDC. Any user can see a listing of all available
data files, including controlled-access files. The GDC Data Portal also allows users to download open-access files without logging
in. However, downloading of controlled-access files is restricted to authorized users and requires authentication.

Logging into the GDC
To log in to the GDC, users must click the Login button at the top right of the GDC website.
After clicking Login, users authenticate themselves using their eRA Commons login and password. If authentication is successful,
the eRA Commons username will be displayed in the upper right corner of the screen, in place of the “Login” button.
Upon successful authentication, GDC Data Portal users can:
• see which controlled-access files they have access to;
• download controlled-access files directly from the GDC Data Portal;
• download an authentication token for use with the GDC Data Transfer Tool or the GDC API.
Controlled-access files are identified using a “lock” icon:

Figure 9.1: Login

The rest of this section describes controlled data access features of the GDC Data Portal available to authorized users. For more
information about open and controlled-access data, and about obtaining access to controlled data, see Data Access Processes and
Tools.

User Profile
After logging into the GDC Portal, users can view which projects they have access to by clicking the User Profile section in the
dropdown menu in the top corner of the screen.

Clicking this button shows the list of projects.

GDC Authentication Tokens
The GDC Data Portal provides authentication tokens for use with the GDC Data Transfer Tool or the GDC API. To download a
token:
1. Log into the GDC using your eRA Commons credentials
2. Click the username in the top right corner of the screen
3. Select the “Download token” option
A new token is generated each time the Download Token button is clicked.
For more information about authentication tokens, see Data Security.
NOTE: The authentication token should be kept in a secure location, as it allows access to all data accessible by the associated
user account.
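
As a minimal sketch of how a downloaded token might be used with the GDC API, the Python snippet below reads the token from a file and attaches it to a request for a single file. The `X-Auth-Token` header and the `https://api.gdc.cancer.gov/data/` endpoint are taken from the GDC API documentation rather than from this guide; the token file name and the file UUID in the usage comments are placeholders. Nothing is sent over the network until `urlopen` is called.

```python
import urllib.request
from pathlib import Path

# Download endpoint of the GDC API (an assumption here; see the GDC API User's Guide).
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data/"


def read_token(token_path):
    """Return the token text stored in the file downloaded from the portal."""
    return Path(token_path).read_text().strip()


def build_download_request(file_uuid, token):
    """Build (but do not send) a request for one file.

    The token travels in the X-Auth-Token request header.
    """
    return urllib.request.Request(
        GDC_DATA_ENDPOINT + file_uuid,
        headers={"X-Auth-Token": token},
    )


# Usage (hypothetical token path and file UUID):
# token = read_token("gdc-user-token.txt")
# request = build_download_request("<file-uuid>", token)
# with urllib.request.urlopen(request) as response:
#     Path("downloaded_file").write_bytes(response.read())
```

Keeping the token in a file with restricted permissions, rather than pasting it into scripts, is in line with the note above about storing it securely.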

Figure 9.2: Token Download Button

Logging Out
To log out of the GDC, click the username in the top right corner of the screen, and select the Logout option.

Figure 9.3: Logout link

Chapter 10

File Cart
Cart and File Download
Overview
While browsing the GDC Data Portal, files can either be downloaded individually from file detail pages or collected in the file
cart to be downloaded as a bundle. Clicking on the shopping cart icon that is next to any item in the GDC will add the item to
your cart.

GDC Cart

Cart Summary
The cart page shows a summary of all files currently in the cart:

• Number of files
• Number of cases associated with the files
• Total file size
The Cart page also displays two tables:
• File count by project: Breaks down the files and cases by each project
• File count by authorization level: Breaks down the files in the cart by authorization level. A user must be logged into
the GDC in order to download ‘Controlled-Access files’
The cart page also explains how to download the files in the cart. For large data files, the GDC Data Transfer Tool is
recommended.

Cart Items

The Cart Items table lists all the files that were added to the Cart. The table gives the following information for each
file in the cart:
• Access: Displays whether the file is open or controlled access. Users must log in to the GDC Portal and have the appropriate
credentials to download controlled-access files.
• File Name: Name of the file. Clicking the link will bring the user to the file summary page.
• Cases: The number of cases associated with the file. Clicking the link will bring the user to the case summary page.
• Project: The Project that the file belongs to. Clicking the link will bring the user to the Project summary page.
• Category: Type of data
• Format: The file format
• Size: The size of the file
• Annotations: Whether there are any annotations

Download Options

There are a few buttons on the Cart page that allow users to download files. The following download options are available:
• Sample Sheet: Downloads a tab-separated file which contains the associated case/sample IDs and sample type for each
file in the cart.
• Metadata: GDC harmonized clinical, biospecimen, and file metadata associated with the files in the cart.
• Download Manifest: Download a manifest file for use with the GDC Data Transfer Tool to download files. A manifest
file contains a list of the UUIDs that correspond to the files in the cart.
• Download Cart: Download the files in the Cart directly through the browser. Users should be mindful of the amount of
data in the cart, since this option does not optimize bandwidth and does not support resuming interrupted downloads.
• SRA XML, MAGE-TAB: This option is available in the GDC Legacy Archive only. It is used to download metadata
files associated with the files in the cart.
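
As an illustration of the Sample Sheet option, the sketch below reads a tab-separated sample sheet and builds a lookup from file name to sample ID and sample type. The column names `File Name`, `Sample ID`, and `Sample Type` are assumptions and should be checked against the header row of the file actually downloaded; the rows shown are made up.

```python
import csv
import io


def index_sample_sheet(handle, file_column="File Name",
                       sample_column="Sample ID", type_column="Sample Type"):
    """Map each file name to its (sample ID, sample type) pair.

    The default column names are assumptions; adjust them to match the
    header row of the downloaded sample sheet.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row[file_column]: (row[sample_column], row[type_column])
        for row in reader
    }


# Made-up rows standing in for a downloaded sample sheet.
example = io.StringIO(
    "File Name\tSample ID\tSample Type\n"
    "example_a.bam\tSAMPLE-0001\tPrimary Tumor\n"
    "example_b.bam\tSAMPLE-0002\tSolid Tissue Normal\n"
)
lookup = index_sample_sheet(example)
print(lookup["example_a.bam"])
```

For a real sample sheet, pass an open file handle (for example `open(path, newline="")`) instead of the in-memory example.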
The cart allows users to download up to 5 GB of data directly through the web browser. This is not recommended for downloading
large volumes of data, in particular due to the absence of a retry/resume mechanism. For downloads over 5 GB we recommend
using the GDC Data Transfer Tool.
Note: when downloading multiple files from the cart, they are automatically bundled into one single Gzipped (.tar.gz) file.
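
Because a multi-file cart download arrives as a single .tar.gz bundle, it has to be unpacked before the files can be used. The following sketch uses Python's standard `tarfile` module to list and extract such a bundle. The function names and paths are hypothetical, and no assumption is made about how the files are laid out inside the bundle.

```python
import tarfile


def list_bundle(bundle_path):
    """Return the names of the regular files inside a .tar.gz bundle."""
    with tarfile.open(bundle_path, "r:gz") as bundle:
        return [member.name for member in bundle.getmembers() if member.isfile()]


def extract_bundle(bundle_path, destination):
    """Unpack every member of the bundle into the destination directory."""
    with tarfile.open(bundle_path, "r:gz") as bundle:
        bundle.extractall(destination)


# Usage (hypothetical paths):
# print(list_bundle("gdc_download.tar.gz"))
# extract_bundle("gdc_download.tar.gz", "gdc_files")
```

Listing the contents first is a cheap way to confirm that the bundle holds the expected files before extracting them.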

GDC Data Transfer Tool
The Download Manifest button will download a manifest file that can be imported into the GDC Data Transfer Tool. Below is
an example of the contents of a manifest file used for download:
id                                    filename                                                 md5                               size     state
4ea9c657-8f85-44d0-9a77-ad59cced8973  mdanderson.org_ESCA.MDA_RPPA_Core.mage-tab.1.1.0.tar.gz                                    2516051  live
b8342cd5-330e-440b-b53a-1112341d87db  mdanderson.org_SARC.MDA_RPPA_Core.mage-tab.1.1.0.tar.gz                                    4523632  live
c57673ac-998a-4a50-a12b-4cac5dc3b72e  mdanderson.org_KIRP.MDA_RPPA_Core.mage-tab.1.2.0.tar.gz                                    4195746  live
3f22dd8d-59c8-43a4-89cf-3b595f2e5a06  14-3-3_beta-R-V_GBL1112940.tif                           56df0e4b4fc092fc3643bd2e316ac05b  6257840  live
7ce05059-9197-4d38-830f-04356f5f851a  14-3-3_beta-R-V_GBL11066140.tif                          6abfee483974bc2e61a37b5499ae9a07  6261580  live
8e00d22a-ca6f-4da8-a1c3-f23144cb21b7  14-3-3_beta-R-V_GBL1112940.tif                           56df0e4b4fc092fc3643bd2e316ac05b  6257840  live
96487cd7-8fa8-4bee-9863-17004a70b2e9  14-3-3_beta-R-V_GBL1112940.tif                           56df0e4b4fc092fc3643bd2e316ac05b  6257840  live

The Manifest contains a list of the file UUIDs in the cart and can be used together with the GDC Data Transfer Tool to download
all files.
Information on the GDC Data Transfer Tool is available in the GDC Data Transfer Tool User’s Guide.
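
To make the manifest format concrete, here is a small Python sketch that parses a manifest, totals the `size` column, and checks bytes against the recorded `md5`. It assumes the manifest is tab-separated with the header `id filename md5 size state` and that the 5 GB browser limit means 5 GiB; both are assumptions, and the two rows are modeled on the example manifest in this section.

```python
import csv
import hashlib
import io

BROWSER_LIMIT_BYTES = 5 * 1024**3  # assumes "5 GB" means 5 GiB


def read_manifest(handle):
    """Parse a tab-separated manifest into a list of row dictionaries."""
    return list(csv.DictReader(handle, delimiter="\t"))


def total_size(rows):
    """Sum the size column (in bytes) over all manifest rows."""
    return sum(int(row["size"]) for row in rows)


def md5_matches(data, expected_md5):
    """Check downloaded bytes against the md5 recorded in the manifest."""
    return hashlib.md5(data).hexdigest() == expected_md5


# Two rows modeled on the example manifest in this section.
manifest_text = (
    "id\tfilename\tmd5\tsize\tstate\n"
    "3f22dd8d-59c8-43a4-89cf-3b595f2e5a06\t14-3-3_beta-R-V_GBL1112940.tif\t"
    "56df0e4b4fc092fc3643bd2e316ac05b\t6257840\tlive\n"
    "7ce05059-9197-4d38-830f-04356f5f851a\t14-3-3_beta-R-V_GBL11066140.tif\t"
    "6abfee483974bc2e61a37b5499ae9a07\t6261580\tlive\n"
)
rows = read_manifest(io.StringIO(manifest_text))
size = total_size(rows)
print(len(rows), "files,", size, "bytes")
print("within browser limit:", size <= BROWSER_LIMIT_BYTES)
```

The manifest itself is normally handed to the GDC Data Transfer Tool (for example `gdc-client download -m manifest.txt`, adding `-t token.txt` for controlled-access files); the tool's User's Guide describes the available options.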

Individual Files Download
Similar to the files page, each row contains a download button to download a particular file individually.

Controlled Files
If an unauthenticated user tries to download a cart containing controlled files, a pop-up offers the choice of
downloading only the open-access files or logging in to the GDC Data Portal through eRA Commons. See
Authentication for details.

Chapter 11

Legacy Archive
Legacy Archive
The GDC Legacy Archive hosts unharmonized legacy data from repositories that predate the GDC (e.g. CGHub). Legacy data is
not actively maintained, processed, or harmonized by the GDC. Legacy users are encouraged to migrate to harmonized datasets.
The GDC Legacy Archive can be accessed from the GDC Data Portal front page as well as from the “GDC Apps” menu.

Overview
The GDC Legacy Archive contains a limited set of features of the GDC Data Portal:
• Facet search: Ability to look for legacy files or legacy annotations based on case, file and annotation facets.
• File and Annotation tables: List of all the legacy files and list of all the legacy annotations.
• File and Annotation detail pages: Information page for each legacy file and annotation.
• Cart: The GDC Legacy Archive and the GDC Data Portal are separate systems with separate download carts.

File Page
The file page of the GDC Legacy Archive is similar to the file page of the GDC Data Portal. It does not include the Workflow,
Reference Genome, and Read Groups sections as these are only applicable to harmonized data available in the GDC Data Portal.
The Legacy Archive includes additional archive information as described below.

Archive
If a file was originally produced as part of an archive containing other files, the archive information (Archive ID and number of
files in the archive) is displayed in the file properties. Selecting the archive displays a list of all other files in
that archive.
Metadata files
If a file has any associated MAGE-TAB or SRA XML metadata files, these files will be listed at the bottom of the page. These
files can be downloaded directly from this page. Alternatively, metadata files can be downloaded from the file cart.

File Cart
The file cart in the GDC Legacy Archive is analogous to the file cart of the GDC Data Portal. It provides an additional button to
download any SRA-XML and MAGE-TAB metadata files associated with the files in the cart.

Chapter 12

Release Notes
Data Portal Release Notes
Release 1.11.0
• GDC Product: GDC Data Portal
• Release Date: December 21, 2017

New Features and Changes
• Updated UI to support SIFT and Polyphen annotations
• A Sample Sheet can now be created which allows easy association between file names and the case and sample submitter_id
• Updated Advanced Search page to include options to Add All Files to Cart, Download Manifest, and View X Cases
in Exploration
• Provide clear message rather than blank screen if survival plots cannot be calculated for particular cohort comparison
• Display sample_type on associated entities section on file page
• Allows for special characters in case, gene, and mutation set upload (-, :, >, .)

Bugs Fixed Since Last Release
• Fixed error when trying to download large number of files from the Legacy Archive cart
• Fixed number of annotations displayed in Legacy Archive for particular entities
• Replaced missing bars to indicate proportion of applicable files and cases on project entity page in Cases and File Counts
by Data Category table
• Fixed project page display when projects are selected that contain no mutation data in the facet panel
• Fixed error where exporting case sets as TSV included fewer cases than the total
• Fixed error in exploration section when adding custom facets. Previously selecting ‘Only show fields with values’ did not
result in the expected behavior
• Fixed error where number of associated entities for a file was showing an incorrect number

Known Issues and Workarounds
• Sample sheet will download with a file name including the date duplicated (e.g. gdc_sample_sheet_YYYY-MM-DD_HHMM.tsv.YYYY-MM-DD_HH-MM.tsv)
• Custom facet filters
– Definitions are missing from the property list when adding custom facet file or case filters

• Visualizations
– Data Portal graphs cannot be exported as PNG images in Internet Explorer. Graphs can be exported in PNG or SVG
format from Chrome or Firefox browsers. Internet Explorer does not display the chart legend and title when re-opening
previously downloaded SVG files; the recommendation is to open downloaded SVG files with another program.
– In the protein viewer there may be overlapping mutations. In this case mousing over a point will just show a single
mutation and the other mutations at this location will not be apparent.
• Entity page
– On the mutation entity page, in the Consequences Table, the “Coding DNA Change” column is not populated for rows
that do not correspond to the canonical mutation.
• Repository and Cart
– The annotation count in File table of Repository and Cart does not link to the Annotations page anymore. The user
can navigate to the annotations through the annotation count in Repository - Case table.
• Legacy Archive
– Downloading a token in the GDC Legacy Archive does not refresh it. If a user downloads a token in the GDC Data
Portal and then attempts to download a token in the GDC Legacy Archive, an old token may be provided. Reloading
the Legacy Archive view will allow the user to download the updated token.
– Exporting the Cart table in JSON will export the GDC Archive file table instead of exporting the files in the Cart only.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.

Release 1.10.0
• GDC Product: GDC Data Portal
• Release Date: November 16, 2017

New Features and Changes
• Support for uploading Case and Mutation sets in Exploration page
• Support for saving, editing, removing Case, Gene and Mutation sets in the Exploration page
• Added a Managed Sets menu where the user can see their saved sets
• Added an Analysis menu with two analyses: Set Operation and Cohort Comparison
• Added a User Profile page that shows all the projects and permissions assigned to the user: available in the username
dropdown after the user logs in

Bugs Fixed Since Last Release
• Project page
– On the project page, the Summary Case Count link should open the case tab on the Repository page - instead it opens
the file page

Known Issues and Workarounds
• Custom facet filters
– Definitions are missing from the property list when adding custom facet file or case filters
– Selecting ‘Only show fields with values’ will show some fields without values in the Repository section. This works
correctly under the Exploration section.
• Visualizations
– Data Portal graphs cannot be exported as PNG images in Internet Explorer. Graphs can be exported in PNG or SVG
format from Chrome or Firefox browsers. Internet Explorer does not display the chart legend and title when re-opening
previously downloaded SVG files; the recommendation is to open downloaded SVG files with another program.
– In the protein viewer there may be overlapping mutations. In this case mousing over a point will just show a single
mutation and the other mutations at this location will not be apparent.
• Entity page
– On the mutation entity page, in the Consequences Table, the “Coding DNA Change” column is not populated for rows
that do not correspond to the canonical mutation.
• Repository and Cart
– The annotation count in File table of Repository and Cart does not link to the Annotations page anymore. The user
can navigate to the annotations through the annotation count in Repository - Case table.
• Legacy Archive
– Downloading a token in the GDC Legacy Archive does not refresh it. If a user downloads a token in the GDC Data
Portal and then attempts to download a token in the GDC Legacy Archive, an old token may be provided. Reloading
the Legacy Archive view will allow the user to download the updated token.
– Exporting the Cart table in JSON will export the GDC Archive file table instead of exporting the files in the Cart only.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.

Release 1.9.0
• GDC Product: GDC Data Portal
• Release Date: October 24, 2017

New Features and Changes
• Support for projects with multiple primary sites per project
• Support for slides that are linked to sample rather than portion

Bugs Fixed Since Last Release
None

Known Issues and Workarounds
• Visualizations
– Data Portal graphs cannot be exported as PNG images in Internet Explorer. Graphs can be exported in PNG or SVG
format from Chrome or Firefox browsers. Internet Explorer does not display the chart legend and title when re-opening
previously downloaded SVG files; the recommendation is to open downloaded SVG files with another program.
– In the protein viewer there may be overlapping mutations. In this case mousing over a point will just show a single
mutation and the other mutations at this location will not be apparent.
• Project page
– On the project page, the Summary Case Count link should open the case tab on the Repository page - instead it opens
the file page
• Entity page
– On the mutation entity page, in the Consequences Table, the “Coding DNA Change” column is not populated for rows
that do not correspond to the canonical mutation.
• Repository and Cart
– The annotation count in File table of Repository and Cart does not link to the Annotations page anymore. The user
can navigate to the annotations through the annotation count in Repository - Case table.
• Legacy Archive
– Downloading a token in the GDC Legacy Archive does not refresh it. If a user downloads a token in the GDC Data
Portal and then attempts to download a token in the GDC Legacy Archive, an old token may be provided. Reloading
the Legacy Archive view will allow the user to download the updated token.
– Exporting the Cart table in JSON will export the GDC Archive file table instead of exporting the files in the Cart only.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.

Release 1.8.0
• GDC Product: GDC Data Portal
• Release Date: August 22, 2017

New Features and Changes
Major features/changes:
• A feature that links the exploration and repository pages was added. For example:
– In the exploration page, cases with a specific mutation could be selected. This set could then be linked to the repository
page to download the data files associated with these cases.
– In the repository menu, the user can select cases associated with specific files. The set could then be linked to
exploration page to view the variants associated with this set of cases.
• Users can now upload a custom gene list to the exploration page and leverage the GDC search and visualization features for
cases and variants associated with the gene set.
• Filters added for the gene entity page. For example:
– Clicking on a mutated gene from the project page will display mutations associated with the gene that are present in
this project (filtered protein viewer, etc.).
– Clicking on a mutated gene from the exploration page will display the mutations associated with the gene filtered by
additional search criteria, such as “primary site is Kidney and mutation impact is high”.
• UUIDs are now hidden from tables and charts to simplify readability. The UUIDs can still be exported and viewed in the
tables using the “arrange columns” feature. In the mutation table, UUIDs are automatically exported.
• Mutation entity page - one consequence per transcript is shown (10 rows by default) in the consequence table. The user
should display all rows before exporting the table.

Bugs Fixed Since Last Release
• Exploration
– Combining “Variant Caller” mutation filter with a case filter will display incorrect counts in the mutation facet. The
number of mutations in the resulting mutation table is correct.
– Mutation table: it is difficult to click on the denominator in “#Affected Cases in Cohort” column displayed to the
left side of the bar. The user should click at a specific position at the top of the number to be able to go to the
corresponding link.

Known Issues and Workarounds
• Visualizations
– Data Portal graphs cannot be exported as PNG images in Internet Explorer. Graphs can be exported in PNG or SVG
format from Chrome or Firefox browsers. Internet Explorer does not display the chart legend and title when re-opening
previously downloaded SVG files; the recommendation is to open downloaded SVG files with another program.
– In the protein viewer there may be overlapping mutations. In this case mousing over a point will just show a single
mutation and the other mutations at this location will not be apparent.
• Project page
– On the project page, the Summary Case Count link should open the case tab on the Repository page - instead it opens
the file page
• Entity page
– On the mutation entity page, in the Consequences Table, the “Coding DNA Change” column is not populated for rows
that do not correspond to the canonical mutation.
• Repository and Cart
– The annotation count in File table of Repository and Cart does not link to the Annotations page anymore. The user
can navigate to the annotations through the annotation count in Repository - Case table.
• Legacy Archive
– Downloading a token in the GDC Legacy Archive does not refresh it. If a user downloads a token in the GDC Data
Portal and then attempts to download a token in the GDC Legacy Archive, an old token may be provided. Reloading
the Legacy Archive view will allow the user to download the updated token.
– Exporting the Cart table in JSON will export the GDC Archive file table instead of exporting the files in the Cart only.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.

Release 1.6.0
• GDC Product: GDC Data Portal
• Release Date: June 29, 2017

New Features and Changes
There was a major new release of the GDC Data Portal focused on Data Analysis, Visualization, and Exploration (DAVE). Some
important new features include the following:
• New visual for the Homepage: a human body provides the number of Cases per Primary Site with a link to an advanced
Cancer Projects search
• The Projects menu provides the Top 20 Cancer Genes across the GDC Projects and the Case Distribution per Project
• A new menu “Exploration” is an advanced Cancer Projects search which provides the ability to apply Case, Gene, and
Mutation filters to look for:
– List of Cases with the largest number of Somatic Mutations
– The most frequently mutated Genes
– The most frequent Variants
– Oncogrid view of mutation frequency

• Visualizations are provided across the Project, Case, Gene and Mutation entity pages:
– List of most frequently mutated genes and most frequent variants
– Survival plots for patients with or without specific variants
– Survival plots for patients with or without variants in specific genes
– Lollipop plots of mutation frequency across protein domains
• Links to external databases (COSMIC, dbSNP, Uniprot, Ensembl, OMIM, HGNC)
• Quick Search for Gene and Mutation entity pages
• The ability to export the current view of a table in TSV
• Retired GDC cBioPortal

For detailed updates please review the Data Portal User Guide.

Bugs Fixed Since Last Release
• BAM Slicing dialog box does not disappear automatically upon executing the BAM slicing function. The box can be closed
manually.
• Very long URLs will produce a 400 error. Users may encounter this after clicking on “source files” on a file page where the
target file is derived from hundreds of other files such as for MAF files.
• If BAM slicing produces an error pop-up message, it will be obscured behind the original dialog box.
– Internet Explorer users are not able to use the “Only show fields with no values” when adding custom facets
– Exporting large tables in the Data Portal may produce a 500 error. Filtering this list to include fewer cases or files
should eliminate the error

Known Issues and Workarounds
• New Visualizations
– Data Portal graphs cannot be exported as PNG images in Internet Explorer. Graphs can be exported to PNG or SVG from
Chrome or Firefox browsers. Internet Explorer does not display the chart legend and title when re-opening previously
downloaded SVG files; the recommendation is to open downloaded SVG files with another program.
– In the protein viewer there may be overlapping mutations. In this case mousing over a point will just show a single
mutation and the other mutations at this location will not be apparent.
• Exploration
– Combining “Variant Caller” mutation filter with a case filter will display wrong counts in the mutation facet. The
number of mutations in the result mutation table is correct.
– Mutation table: it is difficult to click on the denominator in “#Affected Cases in Cohort” column displayed to the
left side of the bar. The user should click at a specific position at the top of the number to be able to go to the
corresponding link.
• Entity page
– On the mutation entity page, in the Consequences Table, the “Coding DNA Change” column is not populated for rows
that do not correspond to the canonical mutation.
• Repository and Cart
– The annotation count in File table of Repository and Cart does not link to the Annotations page anymore. The user
can navigate to the annotations through the annotation count in Repository - Case table.
• Legacy Archive
– Downloading a token in the GDC Legacy Archive does not refresh it. If a user downloads a token in the GDC Data
Portal and then attempts to download a token in the GDC Legacy Archive, an old token may be provided. Reloading
the Legacy Archive view will allow the user to download the updated token.
– Exporting the Cart table in JSON will export the GDC Archive file table instead of exporting the files in the Cart only.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.

Release 1.5.2
• GDC Product: GDC Data Portal
• Release Date: May 9, 2017

New Features and Changes
• Removed link to Data Download Statistics Report
• Updated version numbers of API, GDC Data Portal, and Data Release

Bugs Fixed Since Last Release
• None

Known Issues and Workarounds
• General
– Exporting large tables in the Data Portal may produce a 500 error. Filtering this list to include fewer cases or files
should eliminate the error
– After successful authentication, the authentication popup does not close for Internet Explorer users running in
“Compatibility View”. Workaround is to uncheck “Display Intranet sites in Compatibility View” in Internet Explorer
options. Alternatively, refreshing the portal will correctly display authentication status.

– BAM Slicing dialog box does not disappear automatically upon executing the BAM slicing function. The box can be
closed manually.
– Due to the preceding issue, if BAM slicing produces an error pop-up message, it will be obscured behind the original dialog
box.
– Very long URLs will produce a 400 error. Users may encounter this after clicking on “source files” on a file page where
the target file is derived from hundreds of other files such as for MAF files. To produce a list of source files an API call
can be used with the search parameter “fields=analysis.input_files.file_name”.
∗ Downloading a token in the GDC Legacy Archive does not refresh it. If a user downloads a token in the GDC
Data Portal and then attempts to download a token in the GDC Legacy Archive, an old token may be provided.
Reloading the Legacy Archive view will allow the user to download the updated token.
Example
https://api.gdc.cancer.gov/files/455e26f7-03f2-46f7-9e7a-9c51ac322461?pretty=true&fields=analysis.input_files.fi
• Cart
– Counts displayed in the top right of the screen, next to the Cart icon, may become inconsistent if files are removed
from the server.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– Internet Explorer users are not able to use the “Only show fields with no values” option when adding custom facets
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.
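
The workaround for very long "source files" URLs described in the known issues above can be sketched in Python as follows. The endpoint and the `fields=analysis.input_files.file_name` parameter come from the text; the helper names are hypothetical, and the response shape assumed by `input_file_names` (`data` → `analysis` → `input_files`) is an assumption that should be checked against an actual API response.

```python
from urllib.parse import urlencode

FILES_ENDPOINT = "https://api.gdc.cancer.gov/files/"


def source_files_url(file_uuid):
    """Build the /files URL that asks only for the input file names."""
    query = urlencode({
        "pretty": "true",
        "fields": "analysis.input_files.file_name",
    })
    return f"{FILES_ENDPOINT}{file_uuid}?{query}"


def input_file_names(response):
    """Pull file names out of a decoded JSON response.

    Assumes the shape {"data": {"analysis": {"input_files": [...]}}}.
    """
    analysis = response.get("data", {}).get("analysis", {})
    return [item["file_name"] for item in analysis.get("input_files", [])]


print(source_files_url("455e26f7-03f2-46f7-9e7a-9c51ac322461"))
```

Requesting only the one field keeps the response small even when a file is derived from hundreds of inputs.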

Release 1.4.1
• GDC Product: GDC Data Portal
• Release Date: October 31, 2016

New Features and Changes
• Added a search feature to help users select values of interest in certain facets that have many values.
• Added support for annotation ID queries in quick search.
• Added a warning when a value greater than 90 is entered in the “Age at Diagnosis” facet.
• Added Sample Type column to file entity page.
• Authentication tokens are refreshed every time they are downloaded from the GDC Data Portal.
• Buttons are inactive when an action is in progress.
• Improved navigation features in the overview chart on portal homepage.
• Removed State/Status from File and Case entity pages
• Removed the “My Projects” feature.
• Removed “Created” and “Updated” dates from clinical and biospecimen entities.

Bugs Fixed Since Last Release
• Advanced search did not accept negative values for integer fields.
• Moving from facet search to advanced search resulted in an incorrect advanced search query.
• Some facets were cut off in Internet Explorer and Firefox.

Known Issues and Workarounds
• General
– Exporting large tables in the Data Portal may produce a 500 error. Filtering this list to include fewer cases or files
should eliminate the error
– After successful authentication, the authentication popup does not close for Internet Explorer users running in
“Compatibility View”. Workaround is to uncheck “Display Intranet sites in Compatibility View” in Internet Explorer
options. Alternatively, refreshing the portal will correctly display authentication status.
– BAM Slicing dialog box does not disappear automatically upon executing the BAM slicing function. The box can be
closed manually.
– Due to the preceding issue, if BAM slicing produces an error pop-up message, it will be obscured behind the original dialog
box.
– Very long URLs will produce a 400 error. Users may encounter this after clicking on “source files” on a file page where
the target file is derived from hundreds of other files such as for MAF files. To produce a list of source files an API call
can be used with the search parameter “fields=analysis.input_files.file_name”.
∗ Downloading a token in the GDC Legacy Archive does not refresh it. If a user downloads a token in the GDC
Data Portal and then attempts to download a token in the GDC Legacy Archive, an old token may be provided.
Reloading the Legacy Archive view will allow the user to download the updated token.
Example
1

https://api.gdc.cancer.gov/files/455e26f7-03f2-46f7-9e7a-9c51ac322461?pretty=true&fields=analysis.input_files.fi
• Cart
– Counts displayed in the top right of the screen, next to the Cart icon, may become inconsistent if files are removed
from the server.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– Internet Explorer users are not able to use the “Only show fields with no values” option when adding custom facets.
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.
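
The source-file workaround listed under General above can be sketched in Python. This is a minimal illustration rather than an official client: the helper names `source_files_url`, `input_file_names`, and `fetch_json` are hypothetical, and the response layout (`data.analysis.input_files`) is an assumption about the JSON the API returns. Only the endpoint, the example UUID, and the `fields` parameter come from the text above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def source_files_url(file_uuid, pretty=True):
    """Build a files-endpoint URL that requests only the input file names."""
    params = {"fields": "analysis.input_files.file_name"}
    if pretty:
        params["pretty"] = "true"
    return f"{GDC_FILES_ENDPOINT}/{file_uuid}?{urlencode(params)}"


def input_file_names(response_json):
    """Extract file names from a decoded response.

    Assumes (not confirmed by this guide) that the reply nests the list
    under data -> analysis -> input_files.
    """
    analysis = response_json.get("data", {}).get("analysis", {})
    return [entry["file_name"] for entry in analysis.get("input_files", [])]


def fetch_json(url):
    """Perform the GET request and decode the JSON body (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)
```

A caller would pass the result of `source_files_url(...)` to `fetch_json` and then to `input_file_names`. Because the request names a single file UUID and a single field, the URL stays short no matter how many source files the target has.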

Release 1.3.0
• GDC Product: GDC Data Portal
• Release Date: September 7, 2016

New Features and Changes
• Added a new “Metadata” button on the cart page to download merged clinical, biospecimen, and file metadata in a single
consolidated JSON file. This may require clearing the browser cache.
• Added a banner on the Data Portal to help users find data.
• Added support for the “Enter” key on the login button.
• On the Data page, the browser now remembers which facet tab was selected when the “Back” button is pressed.
• On the file entity page, a link to a single file now redirects to that file’s entity page instead of a list page.

Bugs Fixed Since Last Release
• Adding a mix of open and controlled files to the cart from any Case entity page was creating authorization issues.
• Opening multiple browser tabs and adding files in those browser tabs was not refreshing the cart in other tabs.
• When a user logs in from the advanced search page, the login popup does not automatically close.
• When removing a file from the cart and clicking undo, GDC loses track of the user’s permission status for this file
and asks the user to log in again.
• Download File Metadata button produces incomplete JSON output, omitting such fields as file_name and submitter_id.
The current workaround includes using the API to return file metadata.
• Annotation notes do not wrap to the next line at the beginning or the end of a word; some words might be split across
two lines.
• Sorting annotations by Case UUID causes an error.

Known Issues and Workarounds
• General
– When no filters are engaged in the Legacy Archive or Data Portal, clicking the Download Manifest button may produce
a 500 error and the message “We are currently experiencing issues. Please try again later.” To avoid this error, the
user can first filter by files or cases to reduce the number of files added to the manifest.
– After successful authentication, the authentication popup does not close for Internet Explorer users running in
“Compatibility View”. The workaround is to uncheck “Display Intranet sites in Compatibility View” in Internet Explorer
options. Alternatively, refreshing the portal will correctly display authentication status.
– The BAM Slicing dialog box does not disappear automatically upon executing the BAM slicing function. The box can be
closed manually.
– Due to the preceding issue, if BAM slicing produces an error pop-up message, it will be obscured behind the original
dialog box.
– Very long URLs will produce a 400 error. Users may encounter this after clicking on “source files” on a file page where
the target file is derived from hundreds of other files, such as for MAF files. To produce a list of source files, an API
call can be used with the search parameter “fields=analysis.input_files.file_name”.
Example
https://api.gdc.cancer.gov/files/455e26f7-03f2-46f7-9e7a-9c51ac322461?pretty=true&fields=analysis.input_files.fi
– On the Legacy Archive, searches for “Case Submitter ID Prefix” containing special characters are not displayed
correctly above the result list. The result list is correct, however.
• Cart
– Counts displayed in the top right of the screen, next to the Cart icon, may become inconsistent if files are removed
from the server.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– Internet Explorer users are not able to use the “Only show fields with no values” option when adding custom facets.
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.

Release 1.2.0
• GDC Product: GDC Data Portal
• Release Date: August 9th, 2016

New Features and Changes
• Added a retry (1x) mechanism for API calls
• Added support for ID fields in custom facets
• Added Case Submitter ID to the Annotation entity page
• Added a link to Biospecimen in the Case entity page
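
The “retry (1x)” behavior mentioned above can be illustrated with a generic pattern. The sketch below is written in Python purely for illustration and does not reproduce the portal’s own implementation, which this guide does not describe; the function name `call_with_one_retry` and the `delay_seconds` parameter are hypothetical.

```python
import time


def call_with_one_retry(request, delay_seconds=0.0):
    """Call request(); if it raises, wait briefly and try exactly one more time.

    A second failure is not caught, so the caller still sees the error.
    """
    try:
        return request()
    except Exception:
        time.sleep(delay_seconds)
        return request()
```

Limiting the retry to a single attempt smooths over transient failures without masking a persistent outage, which still surfaces to the user after the second failed call.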

Bugs Fixed Since Last Release
• General
– Not possible to use the browser’s back button after hitting a 404 page
– 404 page missing from Legacy Archive Portal
– Table widget icon and export JSON icon should be different
– Downloading SRA XML files from the Legacy Archive portal might not be possible in some contexts
• Data and facets
– Default values for age at diagnosis are showing 0 to 89 instead of 0 to 90
– Biospecimen search in the case entity page does not highlight (but does bold and filter) results in yellow when title
case is not followed
– Table sorting icon does not include numbers
– ‘–’ symbol is missing on empty fields (blank instead), additional missing fields identified since last release.

Known Issues and Workarounds
• General
– When no filters are engaged in the Legacy Archive or Data Portal, clicking the Download Manifest button may produce
a 500 error and the message “We are currently experiencing issues. Please try again later.” To avoid this error, the
user can first filter by files or cases to reduce the number of files added to the manifest.
– After successful authentication, the authentication popup does not close for Internet Explorer users running in
“Compatibility View”. This only impacts users at the NIH. The workaround is to uncheck “Display Intranet sites in
Compatibility View” in Internet Explorer options. Alternatively, refreshing the portal will correctly display
authentication status.
– When a user logs in from the advanced search page, the login popup does not automatically close.
• Cart
– When removing a file from the cart and clicking undo, GDC loses track of the user’s permission status for this
file and asks the user to log in again.
– Counts displayed in the top right of the screen, next to the Cart icon, might become inconsistent if files are removed
from the server.
– Download File Metadata button produces incomplete JSON output, omitting such fields as file_name and submitter_id.
The current workaround includes using the API to return file metadata.
• Annotations
– Annotation notes do not wrap to the next line at the beginning or the end of a word; some words might be split across
two lines.
– Sorting annotations by Case UUID causes an error.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– Internet Explorer users are not able to use the “Only show fields with no values” option when adding custom facets.
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.
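
For the incomplete Download File Metadata output listed under Cart above, the stated workaround is to use the API. One possible shape for such a request is sketched below in Python. The guide itself does not give the request format, so the following are assumptions drawn from the GDC API’s general search conventions: the `filters` structure (`op`/`content`), the `format` and `size` parameters, and the use of POST on the files endpoint. The helper names `file_metadata_request` and `post_json` are hypothetical.

```python
import json
from urllib.request import Request, urlopen

GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def file_metadata_request(file_ids, fields=("file_name", "submitter_id")):
    """Build a request body asking for selected fields of the given files."""
    ids = list(file_ids)
    return {
        # Assumed filter syntax: match any file whose file_id is in the list.
        "filters": {"op": "in", "content": {"field": "file_id", "value": ids}},
        "fields": ",".join(fields),
        "format": "JSON",
        "size": len(ids),
    }


def post_json(url, payload):
    """Send the payload as JSON and decode the reply (needs network access)."""
    request = Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request) as response:
        return json.load(response)
```

A caller would pass the file UUIDs currently in the cart to `file_metadata_request` and send the result with `post_json(GDC_FILES_ENDPOINT, payload)`. Sending the identifiers in a request body rather than in the URL also avoids very long URLs when many files are involved.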

Release 1.1.0
• GDC Product: GDC Data Portal
• Release Date: June 1st, 2016

New Features and Changes
• This is a bug-fixing release, no new features were added.

Bugs Fixed Since Last Release
• General
– Fixed 508 compliance issues.
– Disabled download manifest action on projects without files.
– Updated the portal to indicate to users that their session has expired when they try to download the authentication
token.
– Unselected the “My Projects” filter after the user logs in.
– Fixed missing padding when query includes “My Projects”.
– Enforced “Add to cart” limitation to 10,000 files everywhere on the Data Portal.

• Tables
– Improved usability of the “Sort” feature
– Updated the “Add all files to cart” button to add all files corresponding to the current query (and not only displayed
files).
– Fixed an issue where Platform would show “0” when selected platform is “Affymetrix SNP 6.0”.
• Data
– Corrected default values populated when adding a custom range facet.
– Fixed an issue that prevented the user from sorting by File Submitter ID in data tables.
• File Entity Page
– Improved “Associated Cases/Biospecimen” table for files associated with many cases.
– Fixed an error when performing BAM Slicing.

Known Issues and Workarounds
• General
– After successful authentication, the authentication popup does not close for Internet Explorer users running in
“Compatibility View”. This only impacts users at the NIH. The workaround is to uncheck “Display Intranet sites in
Compatibility View” in Internet Explorer options. Alternatively, refreshing the portal will correctly display
authentication status.
– Downloading SRA XML files from the Legacy Archive portal might not be possible in some contexts
– Not possible to use the browser’s back button after hitting a 404 page
– 404 page missing from Legacy Archive Portal
– Table widget icon and export JSON icon should be different
• Data and facets
– Default values for age at diagnosis are showing 0 to 89 instead of 0 to 90
– Biospecimen search in the case entity page does not highlight (but does bold and filter) results in yellow when title
case is not followed
– Table sorting icon does not include numbers
– ‘–’ symbol is missing on empty fields (blank instead), additional missing fields identified since last release.
• Cart
– When removing a file from the cart and clicking undo, GDC loses track of the user’s permission status for this
file and asks the user to log in again.
– Counts displayed in the top right of the screen, next to the Cart icon, might become inconsistent if files are removed
from the server.
• Annotations
– Annotation notes do not wrap to the next line at the beginning or the end of a word; some words might be split across
two lines.
• Web Browsers
– Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
– Internet Explorer users are not able to use the “Only show fields with no values” option when adding custom facets.
– The GDC Portals are not compatible with Internet Explorer running in compatibility mode. The workaround is to disable
compatibility mode.
Release details are maintained in the GDC Data Portal Change Log.

Release 1.0.1
• GDC Product: GDC Data Portal
• Release Date: May 18, 2016

New Features and Changes
• This is a bug-fixing release, no new features were added.

Bugs Fixed Since Last Release
• Tables and Export
– Restoring the default table column arrangement restores the previous state instead of the default
• Cart and Download
– Make the cart limit warning message more explanatory
– In some situations, adding filtered files to the cart might fail
• Layout, Browser specific and Accessibility
– When disabling CSS, footer elements are displayed out of order
– If JavaScript is disabled, HTML tags are displayed in the warning message
– Layout issues when using the browser zoom-in function on tables
– Cart download spinner not showing at the proper place
– Not all facets are expanded by default when loading the app

Known Issues and Workarounds
• General
– If a user previously logged into the Portal and left the session without logging out, and then returns to the Portal
after the session ID has expired, the user still appears to be authenticated. The user cannot download the token and
gets an error message that does not close. The user should clear the cache to properly log out.
– ‘–’ symbol is missing on empty fields (blank instead)
– Download manifest button is available for TARGET projects with 0 files, resulting in an error if the user clicks the
button
– After successful authentication, the authentication popup does not close for Internet Explorer users running in
“Compatibility View”. This only impacts users at the NIH. The workaround is to uncheck “Display Intranet sites in
Compatibility View” in Internet Explorer options. Alternatively, refreshing the portal will correctly display
authentication status.
• Data
– When adding a custom range facet, default values are incorrectly populated
– The portal might return an incorrect match between cases and files when using the field
cases.samples.portions.created_datetime (custom facet or advanced search). Note: this is not a UI issue.
– Sorting by File Submitter ID on the file tab results in a Data Portal error
• Tables and Export
– Table sorting icon does not include numbers
• Browsers limit the number of concurrent downloads. It is generally recommended to add files to the cart and download
large numbers of files through the GDC Data Transfer Tool; more details can be found on the GDC Website.
Release details are maintained in the GDC Data Portal Change Log.

Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.5
Linearized                      : No
Page Count                      : 114
Page Mode                       : UseOutlines
Author                          : NCI Genomic Data Commons (GDC)
Title                           : GDC Data Portal User's Guide
Subject                         : 
Creator                         : LaTeX with hyperref package
Producer                        : pdfTeX-1.40.14
Create Date                     : 2018:01:22 19:24:57Z
Modify Date                     : 2018:01:22 19:24:57Z
Trapped                         : False
PTEX Fullbanner                 : This is pdfTeX, Version 3.1415926-2.5-1.40.14 (TeX Live 2013/Debian) kpathsea version 6.1.1