Expression levels and normalization. Samples were processed in 12 mixed batches of patients and controls and hybridized to Illumina WG-6v2 human whole-genome bead arrays. Raw data were processed with Bead Studio v. 3.0 software. Expression levels were exported for signal and negative control probes. The set of negative control probes was used to calculate the average background level for the subsequent filtering and background subtraction steps. Average values of the signal probe expression data for the 137 patient (NSCLC) and 91 control (NHC) sample arrays were used as the base for normalization, and all arrays, including the 18 PRE/18 POST samples and the NYU samples, were quantile normalized against this base.

Array quality control. After each hybridization batch, we computed the gene-wise global correlation as the median Spearman correlation across all pairs of microarrays from all batches, using the expression levels of all signal probes (>48K). The median absolute deviation of the global correlation was also calculated. Then, for each microarray, the median Spearman correlation against all other arrays was computed. Arrays whose median correlation differed from the global correlation by more than 8 absolute deviations (the threshold was chosen empirically) were marked as outliers and excluded from further analysis.
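The normalization and quality control steps described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the base is taken to be the per-probe mean over the base arrays, the median absolute deviation is taken over all pairwise correlations around their median, and ties are ranked by position rather than averaged.

```python
import numpy as np


def ordinal_ranks(x):
    """Rank the values in each column (0 = smallest).

    Simplification: ties are broken by position, not averaged.
    """
    order = np.argsort(x, axis=0, kind="stable")
    return np.argsort(order, axis=0, kind="stable")


def quantile_normalize_to_base(data, base_columns):
    """Quantile normalize every array (column) against a base profile.

    data: probes x arrays matrix of expression levels.
    base_columns: indices of the arrays that define the base.
    Assumption: the base is the per-probe mean over the base arrays,
    and its sorted values are the target distribution.
    """
    base_profile = data[:, base_columns].mean(axis=1)
    target = np.sort(base_profile)
    # The k-th smallest value of each array becomes the k-th smallest
    # value of the base profile.
    return target[ordinal_ranks(data)]


def flag_outlier_arrays(data, n_mads=8.0):
    """Flag arrays whose median Spearman correlation is far from the global one."""
    ranks = ordinal_ranks(data).astype(float)
    # Spearman correlation is the Pearson correlation of the ranks.
    corr = np.corrcoef(ranks, rowvar=False)
    n = corr.shape[0]

    # Global correlation: median over all distinct pairs of arrays.
    pair_values = corr[np.triu_indices(n, k=1)]
    global_corr = np.median(pair_values)
    # Assumption: MAD of the pairwise correlations around that median.
    mad = np.median(np.abs(pair_values - global_corr))

    # Per-array median correlation against all other arrays.
    others = ~np.eye(n, dtype=bool)
    per_array = np.array([np.median(corr[i, others[i]]) for i in range(n)])

    outliers = np.abs(per_array - global_corr) > n_mads * mad
    return outliers, per_array, global_corr, mad
```

In this sketch, `base_columns` would hold the indices of the 137 NSCLC and 91 NHC arrays, and `n_mads=8.0` mirrors the empirically chosen threshold. Arrays for which `outliers` is true would be dropped before downstream analysis.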