WhiteCoatWriter ☆ India, 2019-03-21 08:13 (edited on 2019-03-22 06:17) Posting: # 20063
Dear Forum Members,

I have been trying to understand whether it is possible to create CDISC datasets (SDTM and ADaM) in R while the final product is expected to be in .XPT format. While the regulator doesn't specify the software or the means, it mandates that the final output is in .XPT format (which is the SAS transport format). I want to understand whether there will be any issues/challenges in creating and submitting datasets generated using R.

Thanks and Regards
Dr Anonymous (WhiteCoatWriter)
Helmut ★★★ Vienna, Austria, 2019-03-21 11:47 @ WhiteCoatWriter Posting: # 20064
Hi WhiteCoatWriter,

❝ […] if it is possible to create CDISC datasets (SDTM and ADaM) in R while the final product is expected to be in .XPT format.

Why not?

❝ While the regulator doesn't specify software or the means, …

Correct. The FDA nowhere mandates SAS or recommends any particular software. E.g., Certara provides the Phoenix CDISC Navigator for import/export in Phoenix/WinNonlin.

❝ … it mandates that the final output is in .XPT format (which is SAS format).

Correct. Though there is hope to move towards XML.

Abstract: At the FDA's request, a CDISC-sponsored group has developed an XML-based replacement for the SAS version 5 transport files that are currently used to send case report tabulations clinical trials data to the FDA. Limitations of version 5 transport files such as 8 character variable names, 40 character labels for variables, a 200 character limit for character data, an idiosyncratic representation for dates and times, and sparse built-in metadata stored with the data have become noticeable and troublesome as such limitations have disappeared from other software used in clinical data management. CDISC has developed the more flexible XML-based Operational Data Model (ODM). ODM provides XML analogs to "forms" and "fields" and it has built-in support for extensive metadata. ODM does not specify names for clinical domains (i.e., adverse events, medical history) or data fields. CDISC's Submissions Data Standards (SDS) specifies a structure and metadata for clinical domains and data fields. ODM, by design, provides an excellent XML format for clinical trials data with the SDS-specified structure and metadata. This ODM/SDS combination is the basis for the XML replacement for version 5 transport files in regulatory submissions.

(My emphases. I know, the document is 15 (!) years old.)

❝ I want to understand, if there will be any issues/challenges in executing and submitting datasets generated using R?

None – if the code is validated… You have to start with the package SASxport, which you have to validate first. Though you can dump the functions to inspect the code (in the R console type, e.g., write.xport), I suggest getting the source from CRAN. Much easier. Then write your own code to modify the data (rename fields, change formats, etc.). At the end it would be a good idea to check whether your file can be imported into other software. If you don’t have SAS, opt for a trial license of Phoenix or a demo of Stat/Transfer (I paid USD 295 in 2014 for v12; the current v14 comes for USD 349).

If you are an experienced R-coder, consider writing a package (see here and there). The community will love you.

Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz
The quality of responses received is directly proportional to the quality of the question asked.
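Before handing a data frame to any XPT writer, it may help to check it against the version 5 transport limits listed in the abstract quoted above (8-character variable names, 40-character labels, 200 characters for character data). The following is a minimal base-R sketch. `check_xpt_v5()` is a hypothetical helper and not part of SASxport or any other package, the toy data are invented, and storing labels in a `"label"` attribute is an assumption about how the data frame was prepared.

```r
# Hypothetical helper (not from any package): lists violations of the
# SAS version 5 transport limits named in the abstract above.
check_xpt_v5 <- function(df) {
  issues <- character(0)
  for (nm in names(df)) {
    if (nchar(nm) > 8) {
      issues <- c(issues, sprintf("%s: name longer than 8 characters", nm))
    }
    lab <- attr(df[[nm]], "label")   # assumption: labels live in this attribute
    if (!is.null(lab) && nchar(lab) > 40) {
      issues <- c(issues, sprintf("%s: label longer than 40 characters", nm))
    }
    if (is.character(df[[nm]]) && any(nchar(df[[nm]]) > 200, na.rm = TRUE)) {
      issues <- c(issues, sprintf("%s: value longer than 200 characters", nm))
    }
  }
  issues
}

# Toy data; the variable names are illustrative only.
pc <- data.frame(
  USUBJID       = c("S01-001", "S01-002"),
  PCSTRESN      = c(12.3, 45.6),
  CONCENTRATION = c(1, 2),            # 13 characters: too long for version 5
  stringsAsFactors = FALSE
)
attr(pc$PCSTRESN, "label") <- "Numeric Result/Finding in Standard Units"

issues <- check_xpt_v5(pc)
print(issues)   # one entry, for CONCENTRATION
```

A check like this does not replace validating the writer itself or importing the finished file into other software; it only catches structural problems early.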
WhiteCoatWriter ☆ India, 2019-03-21 13:13 @ Helmut Posting: # 20065
❝ Correct. The FDA nowhere mandates SAS or recommends any particular software. E.g., Certara provides the Phoenix CDISC Navigator for import/export in Phoenix/WinNonlin.

While Phoenix needs an additional package to be purchased, I feel that the same can be done by creating tables in Phoenix (or importing .csv files) and exporting them to .xpt without the specific package, which I assume will only help with limited datasets.

❝ If you are an experienced R-coder, consider writing a package (see here and there). The community will love you.

I might not be an expert coder/programmer, but I have been working on this for quite some time. Thank you for the references, I will definitely try doing something. It really has been a challenge to get it validated using the Pinnacle 21 validator. Also, when it comes to ADaM datasets, it specifically requires the SAS date format, which I assume cannot be created in R.

On another note: while no other regulatory agency mandates datasets for bioequivalence studies, I still keep wondering what really prompted the US FDA in mid-December 2016 to make CDISC datasets mandatory for regulatory submissions.
Helmut ★★★ Vienna, Austria, 2019-03-21 17:43 @ WhiteCoatWriter Posting: # 20066
Hi WhiteCoatWriter,

❝ While Phoenix needs an additional package to be purchased, I feel that the same can be done by creating tables in Phoenix (or importing .csv files) and exporting them to .xpt without the specific package, which I assume will only help with limited datasets.

I did that myself before I got Certara’s CDISC-license. Never tried anything other than raw data (scheduled & actual time points, concentrations) and NCA-results.

❝ ❝ If you are an experienced R-coder, consider writing a package …
❝ I might not be an expert coder/programmer, but I have been working on this for quite some time. Thank you for the references, I will definitely try doing something.

Needs patience and the learning curve is not flat. A coding environment like RStudio and an account at GitHub help greatly.

❝ It really has been a challenge to get it validated using the Pinnacle 21 validator.

I believe it. Congratulations!

❝ Also, when it comes to ADaM datasets, it specifically requires the SAS date format, which I assume cannot be created in R.

SAS’ date value (days since 1960-01-01) and datetime (seconds since 1960-01-01 midnight) are a pain in the ass. The format ddmmmyy:hh:mm:ss is just ugly. I played around in R a bit and see what you mean. I wouldn’t say it’s impossible, only tricky.

The “origin” of date- and time-objects in R is (like the UNIX Epoch time) 1970-01-01 00:00:00 UTC. The timezone is important. Is National Language Support (NLS) part of SAS’ base installation? Without it, everything is in local time. I found a funny document stating:

"The last counter is the datetime counter. This is the number of seconds since midnight, January 1, 1960. Why January 1, 1960? One story has it that the founders of SAS wanted to use the approximate birth date of the IBM 370 system, and they chose January 1, 1960 as an easy-to-remember approximation."

Heck, did they mean the timezone (EST = UTC-5) of Poughkeepsie, NY? The man page of as.POSIX*() claims that the origin of SAS’ datetime is 1960-01-01 00:00:00 GMT, but I prefer to avoid a second-hand reference.

❝ While no other regulatory agency mandates datasets for bioequivalence studies, I still keep wondering what really prompted the US FDA in mid-December 2016 to make CDISC datasets mandatory for regulatory submissions.

Not the slightest idea.
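The two origins described above differ by a fixed number of seconds, so the conversion itself is plain arithmetic. Here is a minimal base-R sketch, assuming both origins are taken as 00:00:00 UTC (which is exactly the point the post leaves open for SAS). The helper names `to_sas_datetime()` and `from_sas_datetime()` are made up for illustration.

```r
# Seconds between the SAS origin (1960-01-01) and the R/UNIX origin
# (1970-01-01), both assumed to be 00:00:00 UTC: 3653 days * 86400 s.
sas_offset <- as.numeric(difftime(
  as.POSIXct("1970-01-01", tz = "UTC"),
  as.POSIXct("1960-01-01", tz = "UTC"),
  units = "secs"
))

# Hypothetical helpers: POSIXct -> SAS datetime value and back.
to_sas_datetime <- function(x) as.numeric(x) + sas_offset
from_sas_datetime <- function(s) {
  as.POSIXct(s - sas_offset, origin = "1970-01-01", tz = "UTC")
}

x <- as.POSIXct("2019-03-21 15:03:24", tz = "UTC")
as.numeric(x)                    # 1553180604 (seconds since 1970-01-01 UTC)
to_sas_datetime(x)               # 1868799804 (seconds since 1960-01-01 UTC)
from_sas_datetime(1868799804)    # back to 2019-03-21 15:03:24 UTC
```

Doing the arithmetic in UTC and converting local clock times beforehand keeps the timezone question out of the conversion itself.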
WhiteCoatWriter ☆ India, 2019-03-22 07:56 @ Helmut Posting: # 20067
Hello again :),

❝ I did that myself before I got Certara’s CDISC-license. Never tried anything other than raw data (scheduled & actual time points, concentrations) and NCA-results.

I have found that easier so far, especially for handling PK data from the project and exporting it as .XPT. However, with the advancements in R, I decided to take up the challenge of working out CDISC datasets in R along with my routine PK and statistical analysis.

❝ Needs patience and the learning curve is not flat. A coding environment like RStudio and an account at GitHub help greatly.

I agree, patience is the key :). I use RStudio predominantly and have a GitHub account (need to be more active though).

❝ … This is the number of seconds since midnight, January 1, 1960. Why January 1, 1960? One story has it that the founders of SAS wanted to use the approximate birth date of the IBM 370 system, and they chose January 1, 1960 as an easy-to-remember approximation. Heck, did they mean the timezone (EST = UTC-5) of Poughkeepsie, NY?
❝ The man-page of […]

Whoa, thanks for the explanation. I did not know the background; I believed ISO 8601 was largely the standard time format until SAS complicated it for me. Now I know how things fall into place.

❝ ❝ While no other regulatory agency mandates datasets for bioequivalence studies, I still keep wondering what really prompted the US FDA in mid-December 2016 to make CDISC datasets mandatory for regulatory submissions.
❝ Not the slightest idea.

Many are banging their heads against this requirement, and organizations are having to extend their timelines to review these data prior to submission.

❝ 2019-03-21 00:00:00 2019-03-20 23:00:00 1553122800 1868742000 20Mar19:23:00:00
❝ 2019-03-21 12:00:00 2019-03-21 11:00:00 1553166000 1868785200 21Mar19:11:00:00
❝ 2019-03-21 15:00:00 2019-03-21 14:00:00 1553176800 1868796000 21Mar19:14:00:00
❝ 2019-03-21 16:03:24 2019-03-21 15:03:24 1553180604 1868799804 21Mar19:15:03:24
❝ To get SAS’ date simply truncate the string with
❝ TODO: Convert

Thanks for working it out. I tried replicating this, and it looks promising enough for me to try it out further.

Thanks -Dr Anonymous (WhiteCoatWriter)
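The last column of the quoted output has the ddmmmyy:hh:mm:ss layout. One way to produce such a string in base R is sketched below. This is not the code from the quoted post (which was not preserved); `sas_datetime_string()` is a hypothetical helper, and `month.abb` is used instead of `"%b"` so that the month abbreviation does not depend on the locale.

```r
# Hypothetical helper: format a POSIXct value as ddmmmyy:hh:mm:ss in a given
# timezone, using the English month abbreviations built into R.
sas_datetime_string <- function(x, tz = "UTC") {
  lt <- as.POSIXlt(x, tz = tz)
  sprintf("%02d%s%02d:%02d:%02d:%02d",
          lt$mday, month.abb[lt$mon + 1L], lt$year %% 100L,
          lt$hour, lt$min, as.integer(floor(lt$sec)))
}

# First row of the quoted output: 1553122800 s since 1970-01-01 UTC.
x <- as.POSIXct(1553122800, origin = "1970-01-01", tz = "UTC")
s <- sas_datetime_string(x)
s                  # "20Mar19:23:00:00"
substr(s, 1, 7)    # date part only: "20Mar19"
```

Truncating with `substr()` is only one possible reading of "truncate the string"; deriving the date from the underlying value would work just as well.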
Helmut ★★★ Vienna, Austria, 2019-03-22 17:58 @ WhiteCoatWriter Posting: # 20068
Hi Dr Anonymous,

❝ I agree, patience is the key :). I use RStudio predominantly and have a GitHub account (need to be more active though).

So you are an advanced RUser!

❝ ❝ … This is the number of seconds since midnight, January 1, 1960…
❝ Whoa, thanks for the explanation.

Welcome. Didn’t know it until yesterday either.

❝ […] I believed ISO 8601 was largely the standard time format

In many countries. In Austria it is not just a norm but legally binding. Rarely used in daily life.

❝ Thanks for working it out. I tried replicating this, and it looks promising enough for me to try it out further.

I was curious myself. My latest code:

library(lubridate) # Makes job easier but has large footprint.

See also:

Which gave on my machine:

location local.datetime offset UTC.datetime loc.SAS.datetime UTC.SAS.datetime

Note the different local times: Poughkeepsie changed EST ↑ EDT on March 10th, Austria will change CET ↑ CEST on March 31st, and New Zealand will change NZDT ↓ NZST on April 7th. See this funny story.
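Since the code and its output did not survive in the post above, here is a base-R sketch that builds a table with the same column names for one instant in three locations. It is explicitly not the original lubridate code. The chosen instant, the Olson timezone names, and the reading of `loc.SAS.datetime` (the local clock time counted as if it were UTC) are assumptions made for illustration.

```r
# One instant, shown for three locations (timezone names are assumptions).
instant <- as.POSIXct("2019-03-22 17:00:00", tz = "UTC")
zones <- c(Poughkeepsie = "America/New_York",
           Vienna       = "Europe/Vienna",
           Auckland     = "Pacific/Auckland")
sas_offset <- 315619200   # seconds from 1960-01-01 to 1970-01-01 (UTC)

fmt <- "%Y-%m-%d %H:%M:%S"
loc_time <- vapply(zones, function(z) format(instant, fmt, tz = z),
                   character(1), USE.NAMES = FALSE)
utc_offset <- vapply(zones, function(z) format(instant, "%z", tz = z),
                     character(1), USE.NAMES = FALSE)

res <- data.frame(
  location         = names(zones),
  local.datetime   = loc_time,
  offset           = utc_offset,
  UTC.datetime     = format(instant, fmt, tz = "UTC"),
  # assumption: local clock time counted from 1960-01-01 00:00:00 local
  loc.SAS.datetime = as.numeric(as.POSIXct(loc_time, tz = "UTC")) + sas_offset,
  UTC.SAS.datetime = as.numeric(instant) + sas_offset,
  stringsAsFactors = FALSE
)
print(res)
```

The offsets depend on the timezone database installed with R, so the daylight saving changes mentioned in the post show up only if that database is current.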
WhiteCoatWriter ☆ India, 2019-03-24 08:59 @ Helmut Posting: # 20078
Hello again,

❝ So you are an advanced RUser!

I'd say that I am in the basement of being an advanced user. I must credit PAGIN (Population Approach Group in India) for introducing me to R and RStudio a couple of years back. It's been a good learning curve since.

❝ Welcome. Didn’t know it until yesterday either.
❝ ❝ […] I believed ISO 8601 was largely the standard time format
❝ In many countries. In Austria it is not just a norm but legally binding. Rarely used in daily life.

Interesting...

❝ I was curious myself. My latest code: …
❝ Which gave on my machine:

Wow, this looks much better. However, I guess that when we represent it in CDISC, it has to be in the form YYYY-MM-DD for the .XPT. I find myself overcomplicating this now :( . More details on this point are available in the CDISC Implementation Guide 3.3, section 4.4. P.S.: viewing/downloading requires mandatory registration. You can download/view the details from HERE

❝ Note the different local times: Poughkeepsie changed EST ↑ EDT on March 10th, Austria will change CET ↑ CEST on March 31st, and New Zealand will change NZDT ↓ NZST on April 7th.
❝ See this funny story.

Interesting read. I knew little about it. Thank you for bringing it up.

Additionally, while I am still trying to fully understand the dataset structure for submission, my ship hit an iceberg this morning. The module structure for submitting datasets as part of the dossier, as defined by CDISC for the US FDA, requires the SAS programs for ADaM datasets to be provided as text-file documents. You can get the module structure from HERE

This clearly looks as if CDISC recommends SAS for datasets (which is like a monopoly, and it kind of brings us back to point one, where we started this thread :( ).

Thanks and regards -Dr Anonymous (WhiteCoatWriter)
Helmut ★★★ Vienna, Austria, 2019-03-24 13:34 @ WhiteCoatWriter Posting: # 20079
Hi Dr Anonymous!

❝ […] however I guess that when we represent it in CDISC, it has to be in the form YYYY-MM-DD for the .XPT.

Right. The FDA’s Technical Specifications Document (January 2019) states:

4.1.4.2 Dates in SDTM and SEND
Dates in SDTM and SEND domains should conform to the ISO 8601 format.

I was confused because you wrote:

❝ ❝ ❝ Also, when it comes to ADaM datasets, it specifically requires the SAS date format […]

Hence, no hassle since ISO 8601 in all of its flavours is the standard in R. See the format option of strptime() for the Date-Time Classes POSIXct and POSIXlt.

❝ More details on this point are available in […]

Sorry, I don’t have an account and don’t want to get one.

❝ Additionally, while I am still trying to fully understand the dataset structure for submission, my ship hit an iceberg this morning.
❝ The module structure for submitting datasets as part of the dossier, as defined by CDISC for the US FDA, requires the SAS programs for ADaM datasets to be provided as text-file documents.
❝ This clearly looks as if CDISC recommends SAS for datasets (which is like a monopoly, and it kind of brings us back to point one, where we started this thread :( ).

Really? According to the FDA’s spec’s:

3.3.1 SAS Transport Format

Further down, the Pharmacokinetics Concentrations (PC) Domain in Section 4.1.3.3 is interesting:

When actual dates or date/time values are available for PCRFTDTC/PPRFTDTC, they can be included.

Can‽ We’re in the 21st century, folks! Who works with nominal times – apart from binning mean data – in PK?

P.S.: You can add a signature in your profile (SOP). Saves a few keystrokes in future posts.
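To illustrate the point about ISO 8601 and the format option of strptime(), here is a minimal base-R sketch that writes and reads the two most common ISO 8601 forms. The timestamp is an invented example and UTC is assumed throughout.

```r
# Invented example timestamp; UTC is an assumption.
x <- as.POSIXct("2019-03-21 16:03:24", tz = "UTC")

iso_date     <- format(x, "%Y-%m-%d")            # date only
iso_datetime <- format(x, "%Y-%m-%dT%H:%M:%S")   # date and time, "T" separator
iso_date
iso_datetime

# Parsing an ISO 8601 date/time string back into a POSIXct value.
y <- as.POSIXct(strptime("2019-03-21T16:03:24", "%Y-%m-%dT%H:%M:%S", tz = "UTC"))
identical(as.numeric(x), as.numeric(y))
```

The resulting character strings can be stored as ordinary character variables, which is why the ISO 8601 requirement for SDTM and SEND causes no extra work in R.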
WhiteCoatWriter ☆ India, 2019-03-25 10:39 @ Helmut Posting: # 20080
Hello Helmut.

❝ Right. The FDA’s Technical Specifications Document (January 2019) states:
❝ 4.1.4.2 Dates in SDTM and SEND
❝ Dates in SDTM and SEND domains should conform to the ISO 8601 format.
❝ I was confused because you wrote:
❝ ❝ ❝ ❝ Also, when it comes to ADaM datasets, it specifically requires the SAS date format […]

Well, that is true as to what the FDA states, and it applies to the SDTM datasets. However, in ADaM datasets, especially ADSL, certain variables such as treatment start and end require the SAS date format; only then do we get a successful validation report from Pinnacle 21. (I have tried ISO 8601 and got validation errors stating "variable is not in sas date format".)

❝ Sorry, I don’t have an account and don’t want to get one.
❝ P.S.: You can add a signature in your profile (SOP). Saves a few keystrokes in future posts.

Thanks for the tip. Worked it out.
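If a validator insists on a SAS date value for a variable such as a treatment start date, the number can be derived in R from an ISO 8601 character value, because a SAS date is simply the count of days since 1960-01-01. The sketch below uses invented values, and the names `trtsdtc` and `trtsdt` are illustrative only. Whether this is what the ADaM standard actually asks for is the open question of this thread, and attaching a display format in the XPT file depends on the writer used and is not shown.

```r
# Invented ISO 8601 character values: date only, date/time, and missing.
trtsdtc <- c("2019-03-21", "2019-03-28T08:15:00", NA)

# Keep the date part (first 10 characters) and convert to R's Date class.
d <- as.Date(substr(trtsdtc, 1, 10), format = "%Y-%m-%d")

# SAS date value: days since 1960-01-01 (R itself counts from 1970-01-01).
trtsdt <- as.numeric(d - as.Date("1960-01-01"))
trtsdt    # 21629 21636 NA

# And back again, to check the round trip.
as.Date(trtsdt, origin = "1960-01-01")
```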
Helmut ★★★ Vienna, Austria, 2019-03-25 14:44 @ WhiteCoatWriter Posting: # 20083
Hi Dr Anonymous,

❝ Well, that is true as to what the FDA states, and it applies to the SDTM datasets. However, in ADaM datasets, especially ADSL, certain variables such as treatment start and end require the SAS date format; only then do we get a successful validation report from Pinnacle 21. (I have tried ISO 8601 and got validation errors stating "variable is not in sas date format".)

Now I’m lost. The FDA and Japan’s PMDA are the only agencies requiring CDISC Standards. Both are “Platinum Members” (whatever that means) of CDISC. So far, so good. But why the heck is CDISC requiring the bloody SAS date format for SDTM and SEND when the FDA definitely wants ISO 8601? At the end of the day you will submit your data to an agency and not to a validator. I bit the bullet and registered at CDISC. In the “ADaM Implementation Guide” Version 1.1 (Feb 12th, 2016), page 17 I found:

So how does the SAS-format come into play? PINNACLE21 was released on Sep 27th, 2016. Is the validator not compliant with the standards?
WhiteCoatWriter ☆ India, 2019-03-27 07:55 @ Helmut Posting: # 20084
Hello again...

❝ So how does the SAS-format come into play? PINNACLE21 was released on Sep 27th, 2016. Is the validator not compliant with the standards?

Possibly, I assume...! It seems that the validator is the only challenge here... I guess we can justify the error in the reviewer's guide we share along with the datasets. It would be up to the regulator to decide further. Thanks for the support. It could take time, but I would be glad to work on this and see if I can come up with a package (sooner or later) for CDISC datasets.