During data generation and analysis, a series of files of varying sizes is created.
In Sanger sequencing, the stored data includes unedited chromatograms (“raw” data), edited chromatograms, sequence alignments and summarized results/reports. Equivalent components can be identified within NGS pipelines, although the amount of storage required will be significantly larger.
Some genomic data (e.g. whole genome or whole exome data) may need to be accessed and analysed repeatedly over a longer period than typical data retention policies anticipate. Where possible, the laboratory should determine the feasibility of very long-term data retention. The laboratory should also develop a formal data management policy that minimizes the possibility of data loss. During analysis, genomic data will be transferred to a number of different computers for processing and/or storage.
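Because each transfer between computers is a point at which a file could be truncated or corrupted, one possible safeguard is to compare a checksum of the file before and after it is copied. The following is a minimal Python sketch of that idea, not a prescribed procedure: the function names `sha256_of` and `copy_and_verify` are hypothetical, and the choice of SHA-256 is an assumption rather than a requirement stated here.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large sequencing files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy a file, then confirm the copy is byte-identical to the source.

    Hypothetical helper for illustration only; a real policy would also
    cover logging, retries and where the checksums are recorded.
    """
    source = Path(source)
    destination = Path(destination)
    expected = sha256_of(source)
    shutil.copy2(source, destination)  # copies contents and file metadata
    actual = sha256_of(destination)
    if actual != expected:
        raise IOError(
            f"Checksum mismatch after copying {source} to {destination}"
        )
    return expected
```

The returned digest could be stored alongside the file so that the same comparison can be repeated later, for example when data are retrieved from long-term storage.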