We have now dedicated processes to handle non-tabular data. We now support the following data.

Methylation data

We use the methylclock package to generate the tabular methylation clocks. Which allows us to upload it to the DataSHIELD backends.

You can start with uploading the data. The raw data is processed by the ds-upload package and the autmatically uploaded to the DataSHIELD backend.

Preparation steps:

  • Reformat id to child_id-value: When you prepared the raw data make sure the ID column in the raw data corresponds with the right child_id in you other data. For instance in the core of outcome variables.
  • Add age information to the methylation data When you generate the methylation data you need to add the covariates data as well. You can do this by filling in the input path for the covariate data in the du.upload.methyl.clock(covariate_data_input_path = "C:\tmp\covariate_data.csv") Make sure you have and “Age” column in the data to use in the clock generation process.

You can generate the clocks and upload to the Armadillo using these commands

Upload to Armadillo

Install the packages

install.packages("dsUploadMethyl", repos = "https://registry.molgenis.org/repository/R", dependencies = TRUE)
install.packages("methylclock", repos = "https://registry.molgenis.org/repository/R", dependencies = TRUE)

If dependencies do not install automatically please install:

install.package("BiocManager")
BiocManager::install(c("minfi", "preprocessCore", "BiocStyle", "Biobase", "impute"))

Login to the Armadillo

login_data <- data.frame(
  server = "https://armadillo.test.molgenis.org", 
  storage = "https://armadillo-minio.test.molgenis.org", 
  driver = "ArmadilloDriver"
)

library(methylclock)
library(dsUploadMethyl)

du.login(login_data)

Upload the data to Armadillo

du.upload.methyl.clocks(
  cohort_id = "gecko",
  methyl_data_input_path = "https://raw.githubusercontent.com/lifecycle-project/ds-upload-methyl/master/inst/examples/data/METHYL/MethylationDataExample55.csv",
  covariate_data_input_path = "https://raw.githubusercontent.com/lifecycle-project/ds-upload-methyl/master/inst/examples/data/METHYL/SampleAnnotationExample55.csv",
  dna_source = 'placenta',
  norm_method = 0,
  dict_version = "1_0")

Upload to Opal

You can generate the clocks and upload to the Opal using these commands:

Install the packages

install.packages(c("dsUploadMethyl","methylclock"), repos = "https://registry.molgenis.org/repository/R", dependencies = TRUE)

Login to the Opal server

login_data <- data.frame(
  server = "https://opal.edge.molgenis.org", 
  username = "administrator", 
  password = "ouf0uPh6", 
  driver = "OpalDriver"
)

library(dsUploadMethyl)

du.login(login_data)

Upload the data

du.upload.methyl.clocks(
  dict_name = "methylclocks",
  methyl_data_input_path = "https://raw.githubusercontent.com/lifecycle-project/ds-upload-methyl/master/inst/examples/data/METHYL/MethylationDataExample55.csv",
  covariate_data_input_path = "https://raw.githubusercontent.com/lifecycle-project/ds-upload-methyl/master/inst/examples/data/METHYL/SampleAnnotationExample55.csv",
  dna_source = 'placenta',
  norm_method = 0,
  dict_version = "1_0",
  data_version = "1_0")

Troubleshooting

You can run into trouble running the methylation upload into Opal. Here are some answers to questions you can encouter.

Files are not loaded

If the input files are not picked up directly please make sure the whole input path is specified:

Example:

du.upload.methyl.clocks(
  cohort_id = "gecko",
  methyl_data_input_path = "C:\Users\yourself\raw_clock_data.csv",
  covariate_data_input_path = "C:\Users\yourself\raw_covariate_data.csv",
  dna_source = 'placenta',
  norm_method = 0,
  dict_version = "1_0",
  data_version = "1_0")

Cannot allocate vector of size n Mb …

Running the methylation data generation can be memory consuming. You need to allocate enough memory to perform the action. You can exceute

memory.limit(size=2500)