Bioequivalence and Bioavailability Forum 14:42 CET


ElMaestro
Hero

Denmark,
2018-11-04 21:54

Posting: # 19527
Views: 633
 

 Isn't it time for an open source chromatographic format [Bioanalytics]

Hi all,



I think it would be great if an open source format for time series of chromatogram data were defined (i.e. something as simple as signal (counts per second etc.) as a function of time) and if the big software vendors would allow import of such files for quantification with the proprietary peak finding and integration algos in Analyst, MassHunter, MassLynx etc.

There is simply no reason not to do this apart from IP, and it would make enormously good sense from a QA perspective. It might also increase competition among companies developing serious algos. Comparing areas of identical appearance between packages is one thing (and currently impossible!), but comparison of absent peaks or S:N < 5 etc. would also be facilitated, as would bias/behaviour in case of tailing and much more. At present there is not even a definition of how to quantify S:N (how would you do it? Is your way better than mine? No? In which implementation context does a regulatory requirement about it make sense?).

One of the packages happily assigns an RT to a split peak or identifies "a peak on a peak" (and is able to subtract areas accordingly) while other packages don't seem to do that. Depending on circumstances this may be desirable or not. I was recently at a CRO where the SOP said that a split peak is defined as a cluster of CPS offset from baseline that is assigned two RTs. With the introduction of a station from another vendor, and thus other software, that SOP actually seemed to become partially irrelevant for the new station. And so forth. I am using the term 'seemed' here because the behaviour is not defined in manuals and the software vendor is completely silent when asked. No can talk, much secret, highly intellectual property, hush hush.

We need an open source format for chromatographic data. It could be as simple as two columns in a CSV-file plus some meta data.
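
To make the "two columns plus some meta data" idea concrete, here is a minimal sketch in R. The layout is purely an assumption for illustration (metadata as '#'-prefixed "key: value" lines, followed by a plain two-column CSV body); the function names and metadata keys are hypothetical and not part of any existing standard.

```r
# Hypothetical layout: '#'-prefixed "key: value" metadata lines,
# followed by a two-column CSV body (time, signal).
write_chrom <- function(path, time, signal, meta) {
  header <- sprintf("# %s: %s", names(meta), unlist(meta))
  con <- file(path, open = "wt")
  on.exit(close(con))
  writeLines(header, con)
  write.csv(data.frame(time = time, signal = signal), con, row.names = FALSE)
}

read_chrom <- function(path) {
  lines <- readLines(path)
  kv    <- sub("^#\\s*", "", lines[grepl("^#", lines)])
  # split each metadata line into key (before the first colon) and value
  meta  <- setNames(sub("^[^:]*:\\s*", "", kv), sub(":.*$", "", kv))
  # comment.char = "#" makes read.csv skip the metadata lines
  data  <- read.csv(path, comment.char = "#")
  list(meta = as.list(meta), data = data)
}

f <- tempfile(fileext = ".csv")
write_chrom(f, time = c(0.01, 0.02, 0.03), signal = c(5, 7, 6),
            meta = list(time_unit = "min", signal_unit = "cps"))
chrom <- read_chrom(f)
str(chrom)
```

Anything that can read a text file could consume such a file, which is the whole point; what the metadata block should contain would be up to whoever defines the format.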

We need it. Full stop.

if (3) 4

Best regards,
ElMaestro

"(...) targeted cancer therapies will benefit fewer than 2 percent of the cancer patients they’re aimed at. That reality is often lost on consumers, who are being fed a steady diet of winning anecdotes about miracle cures." New York Times (ed.), June 9, 2018.
nobody
Senior

2018-11-05 09:00

@ ElMaestro
Posting: # 19528
Views: 586
 

 Isn't it time for an open source chromatographic format

» There is simply no reason not to do this apart from IP ...

Jepp, and therefore: End of story-

Money - greed - world we live in

Elsevier, the charming little startup can force Swedish ISP Bahnhof to block SciHub. Due to IP BS. This is the world we live in.

I would love to see regulatory guidelines be developed by a github-like tracking system, to make absolutely transparent who introduced which nonsense into which guideline. I think the regulatory authorities owe this transparency to people who pay their salaries. Don't expect this to happen within the next 100 years.

Kindest regards, nobody
ElMaestro
Hero

Denmark,
2018-11-05 13:53

@ nobody
Posting: # 19530
Views: 556
 

 and now that we are at it

How would you calculate s:n for y3 in a series like this:


n=1500
t=c(1:n)/100
y1=rep(0,n)

for (i in 1:314) y1[400+i]=13*sin(i/100)
y2=3*runif(n)*log(t+10)
y3=y1+y2
plot(t ,y3, pch=20, cex=0.1)
lines(t, y3)


As usual, I am not so much asking you how you think others would do it in other contexts. I am asking you how you would do it. Let us for simplicity assume there is a peak around 5 units here and that we know nuffin about anything except y3 and t.
You are of course also welcome to eyeball it and tell me the level of s:n. Make all the assumptions you wish. Change data as you please.

Note: true chromatographic raw data are much uglier than this. Whatever you see when visualizing peaks with or without smoothing, bunching and gamma modality or whatever the heck it is called involves a lot of signal conditioning in all major packages.

if (3) 4

Best regards,
ElMaestro

nobody
Senior

2018-11-05 16:33

@ ElMaestro
Posting: # 19532
Views: 530
 

 and now that we are at it

Hi Danmark!

As I wrote here in the past: Don't mess around tooo much with the LLOQ as this is in general not good for one's mental health. Some issues there might easily blow your mind if you start thinking about them for too long. Some kind of life-death border for your analytical method and a philosophical conundrum of a large scale.

If you have much and important information in the samples around the LLOQ find a better method or forget the whole thing. It's not worth the time you spend in this dark-grey area of our universe...

Kindest regards, nobody
ElMaestro
Hero

Denmark,
2018-11-05 18:19

@ nobody
Posting: # 19533
Views: 510
 

 and now that we are at it

Hello nobody,

it is at the LLOQ the magic is needed. This is where the s:n is measured, also.

At present we have about 70 companies pursuing Tiotropium/Formoterol/Salmeterol/Fluticasone/and so forth approvals and for some of those methods we need an LLOQ of 0.1 pg/mL and not one iota less; even with the API6500 Qtrap and similar equipment this is at the border. The opportunities for assay optimization have been exhausted (columns, derivatization, SPE/LLE, and much more have been explored), the agency might not approve higher doses than a single shot per subject per period and we need to bear in mind that Cmax/LLOQ should be 20 or better.

You try and see if you can find a way to get approval for Tiotropium with e.g. 0.5 pg/mL. Good luck when meeting the agency :-D

So, in as much as I agree with you that LLOQ should not ever be an issue, LLOQ is a persistent issue. But of course, that really isn't what this thread is all about :-)

if (3) 4

Best regards,
ElMaestro

Helmut
Hero
Vienna, Austria,
2018-11-05 20:39

@ ElMaestro
Posting: # 19534
Views: 500
 

 Transparent ruler – like in the good ol’ days

Hi ElMaestro,

» You are of course also welcome to eyeball it and tell me the level of s:n.

You are a nasty person. Noise increases with time. Before the peak ~6.93, after ~8.30, mean ~7.62.
Peak height 13.7 gives S:N ~1.8. Not an awful lot.
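
For comparison with the ruler, here is a rough sketch in R of a peak-to-peak estimate on the simulated series. It is not the procedure used for the figures above: the seed, the peak-free windows, and the apex window are all assumptions, and the result moves with each of them.

```r
set.seed(42)                          # assumption: fixed only for reproducibility
n  <- 1500
t  <- (1:n) / 100
y1 <- rep(0, n)
y1[401:714] <- 13 * sin((1:314) / 100)
y3 <- y1 + 3 * runif(n) * log(t + 10)

pre  <- 201:400                       # assumed peak-free window before the peak
post <- 715:914                       # assumed peak-free window after the peak
apex <- 540:574                       # assumed narrow window around the apex

# noise: mean of the peak-to-peak ranges before and after the peak
p2p_noise <- mean(c(diff(range(y3[pre])), diff(range(y3[post]))))
# signal: median level at the apex minus the median level of the baseline
baseline  <- median(y3[c(pre, post)])
height    <- median(y3[apex]) - baseline
sn_p2p    <- height / p2p_noise
c(noise = p2p_noise, height = height, sn = sn_p2p)
```

Widening or moving the windows changes the answer noticeably, which is part of the reason an eyeballed value and a computed one rarely agree.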

[image]



Eyeballing, of course. No silicon-based life form would agree with me.

If you find an error, you can keep it.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes
Ohlbe
Hero

France,
2018-11-05 22:14

@ Helmut
Posting: # 19535
Views: 484
 

 Transparent ruler – like in the good ol’ days

Dear Helmut and ElMaestro,

» Eyeballing, of course.

That's how I do it too. I don't trust automatic calculations by whatever software. It depends so much on the parameters you enter. You may get a S/N of 30 with the plot you provided. Nonsense.

» Peak height 13.7 gives S:N ~1.8. Not an awful lot.

With a visual quick and dirty look I would have said 3, but I was too fast and I was wrong (stupidly took the signal from the bottom of the graph). Revised slower-and-still-dirty estimate 2.

Regards
Ohlbe
ElMaestro
Hero

Denmark,
2018-11-05 22:43

@ Ohlbe
Posting: # 19536
Views: 491
 

 Transparent ruler – like in the good ol’ days

Hi Ohlbe and Hötzi,

Thanks for your qualified opinions.

I am inclined to do this:

lines(c(5.6, 5.6), c(4.1,18), col="green", lwd=6)
lines(c(8, 8), c(4.1,8.8 ), col="red", lwd=6)


where the red line indicates the level of noise (in this case, right of the peak) and the green one is the signal. Roughly.

Note that in both cases I quantify s as well as n in one direction from the baseline mean or median or whatever.

Thus I am landing at s:n = (18-4.1) / (8-4.1) = 3.6.
I am not in any way claiming this is better or worse, only that this is my idea of an approach.
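
The same one-directional idea can be sketched in R without reading values off the plot. This is only an illustration under assumptions: the seed, the choice of the median as baseline, and the window 800:1000 as "noise right of the peak" are arbitrary picks.

```r
set.seed(42)                          # assumption: fixed only for reproducibility
n  <- 1500
t  <- (1:n) / 100
y1 <- rep(0, n)
y1[401:714] <- 13 * sin((1:314) / 100)
y3 <- y1 + 3 * runif(n) * log(t + 10)

peak_idx  <- 401:714                  # indices carrying the simulated peak
noise_idx <- 800:1000                 # assumed peak-free region right of the peak

baseline <- median(y3[noise_idx])     # baseline as the median of the noise region
s <- max(y3[peak_idx])  - baseline    # signal: top of the peak above the baseline
n_up <- max(y3[noise_idx]) - baseline # noise: top of the noise band above the baseline
sn_one_sided <- s / n_up
sn_one_sided
```

Because noise is measured in one direction only, this convention gives roughly twice the ratio obtained when noise is taken as the full peak-to-peak range.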

If I recall correctly, if you are "a large software vendor" -and I will mention none in particular- you can also do something like:

k=3                   #a miserable sad pointless constant to make s:n look better ??
a=sd (y3[400:714])    #sd of points on the peak
b=sd (y3[800:1000])   #sd of points adjacent to the peak
sn=k*a/b


which gives a result of about 5. :-D
Personally, I would of course always adjust k so that s:n is not less than 10 or so, just to avoid questions. I mean, I care about my data because I am not a nasty person :-D:-D:-D

if (3) 4

Best regards,
ElMaestro

Helmut
Hero
Vienna, Austria,
2018-11-06 02:05

@ ElMaestro
Posting: # 19537
Views: 462
 

 Slow down, you move too fast […] Feelin’ Groovy

Hi ElMaestro,

» I am inclined to do this: [some R-code]
»
» where the red line indicates the level of noise (in this case, right of the peak) and the green one is the signal. Roughly.

Disagree and agree.

» Note that in both cases I quantify s as well as n in one direction from the baseline mean or median or whatever.
»
» Thus I am landing at s:n = (18-4.1) / (8-4.1) = 3.6.
» I am not in any way claiming this is better or worse, only that this is my idea of an approach.

OK, you are aware that this is not according to the textbooks about chromatography [1]. See also Dolan’s figure 1 [2]. Noise is defined as the range of recorded “zero” signals. Hence, your S:N is by a factor of two better than mine.

» If I recall correctly, if you are "a large software vendor" -and I will mention none in particular- you can also do something like:
»
» k=3                   #a miserable sad pointless constant to make s:n look better ??
» a=sd (y3[400:714])    #sd of points on the peak
» b=sd (y3[800:1000])   #sd of points adjacent to the peak
» sn=k*a/b

»
» which gives a result of about 5.:-D

Well…

» Personally, I would of course always adjust k so that s:n is not less than 10 or so, just to avoid questions. I mean, I care about my data because I am not a nasty person :-D:-D:-D

Well, your example is artificial. Remember this one? What you constructed here is what you would get when you sample from the A/D-converter at 1.66 Hz. For your peak width that’s by a factor of five to ten too fast. With a reasonable bundling you would get the blue line (no time-constant; moving averages for simplicity) …

[image]


… and, voilà, an S:N of ~4.7. Signal bundling is not done for fun. Peak detection (start/end) requires the 1st derivative of the signal and finding the “peak’s summit” the 2nd. See these examples (slides 11–12). Difficult enough for bundled data, impossible for what you consider “raw data”. It’s simply a necessity for proper integration. Finding the right value is always a compromise. If you bundle too much, peaks will be distorted and separation between adjacent peaks suffers. If you bundle too little, the system will have a hard time detecting the right start/end of peaks. Nothing is perfect.
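
A small R sketch of the effect of bundling, using a centred moving average as in the description above. The window width of 25 points, the seed, and the noise/apex windows are assumptions; the settings behind the blue line in the figure are not known, so the numbers will differ.

```r
set.seed(42)                          # assumption: fixed only for reproducibility
n  <- 1500
t  <- (1:n) / 100
y1 <- rep(0, n)
y1[401:714] <- 13 * sin((1:314) / 100)
y3 <- y1 + 3 * runif(n) * log(t + 10)

k <- 25                               # assumed bundling width (points)
# centred moving average; the first and last 12 points become NA
y_b <- as.numeric(stats::filter(y3, rep(1 / k, k), sides = 2))

pre <- 201:400; post <- 715:914; apex <- 540:574   # assumed windows

sn_p2p <- function(y) {
  noise  <- mean(c(diff(range(y[pre])), diff(range(y[post]))))
  height <- median(y[apex]) - median(y[c(pre, post)])
  height / noise
}

sn_raw     <- sn_p2p(y3)
sn_bundled <- sn_p2p(y_b)

# spread of the point-to-point slope in a peak-free region,
# i.e. what a first-derivative peak detector would have to cope with
slope_raw     <- sd(diff(y3[pre]))
slope_bundled <- sd(diff(y_b[pre]))
c(sn_raw = sn_raw, sn_bundled = sn_bundled,
  slope_raw = slope_raw, slope_bundled = slope_bundled)
```

The slope spread drops sharply after bundling, which illustrates why derivative-based start/end detection is hopeless on the unbundled series. Increasing `k` further keeps improving the apparent S:N until the peak itself starts to flatten.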


  1. Kuss H-J, Kromidas S (eds.). Quantification in LC and GC. Weinheim: Wiley-VCH; 2009.
  2. Dolan JW. The Role of the Signal-to-Noise Ratio in Precision and Accuracy. LCGC North America. 2005;23(12):1256–1260.

Cheers,
Helmut Schütz
nobody
Senior

2018-11-06 09:29

@ Helmut
Posting: # 19538
Views: 429
 

 Slow down, you move too fast […] Feelin’ Groovy

May I ask what we are doing here? Is this example peak the lowest calibration point? Or do you want to determine s:n for a patient sample (why?)?

Kindest regards, nobody
ElMaestro
Hero

Denmark,
2018-11-06 11:09

@ nobody
Posting: # 19539
Views: 416
 

 Yes it went a little fast

Hello nobody,


» May I ask what we are doing here? Is this example peak the lowest calibration point? Or do you want to determine s:n for a patient sample (why?)?

This was just a digression.
I asked no one in particular what s:n is and gave an example of a fictitious simulated chromatogram on which the forum's experts could practice their skills and demonstrate knowledge.
Shouldn't have done it. But now that you are here, what is your idea of s:n? How do you calculate it?

Whether the series represents an LLOQ or a PK sample or something entirely different is immaterial to this discussion.

So.... what is s:n in your opinion? How would you prefer to quantify the level of noise and the level of signal?

if (3) 4

Best regards,
ElMaestro

nobody
Senior

2018-11-06 12:05

@ ElMaestro
Posting: # 19540
Views: 407
 

 Yes it went a little fast

To be honest: I prefer to know all (or at least a lot of) the facts before I propose any solution to a problem. Maybe not so easy in a public forum, I know.

But in my opinion the solution has to fit the problem you are addressing. Otherwise there would already be a DIN/ISO in place, or? :-D

s:n is frequently a problem in trace analytics...

Kindest regards, nobody
ElMaestro
Hero

Denmark,
2018-11-06 12:10

@ nobody
Posting: # 19541
Views: 404
 

 Yes it went a little fast

Hello nobody,

» But in my opinion the solution has to fit the problem you are addressing. Otherwise there would already be a DIN/ISO in place, or? :-D
»
» s:n is frequently a problem in trace analytics...

Clever remark.
Feel free to make assumptions, change data, and feel free to assume it is a spiked plasma sample at the LLOQ in a bioanalytical validation or feel free to make any other assumption that allows you to think you have a solution to the question.

I am curious as to what the solution is, as in, how exactly are signal and noise quantified (once I have that I think I can make the division myself, but I am not even 100% sure about it).

if (3) 4

Best regards,
ElMaestro

Helmut
Hero
Vienna, Austria,
2018-11-07 16:09

@ ElMaestro
Posting: # 19542
Views: 351
 

 Yes it went a little fast

Hi ElMaestro,

IMHO, calculating the S:N is no more than a finger exercise unless required by the method (e.g., traces in accredited environmental analysis where a true blank matrix does not exist). In bioanalytical methods I'm not sure whether it should be part of an (optional anyway) system suitability test. Playing the devil's advocate - two methods, A/P at the LLOQ 20/20%, S/N 10 or 15/15 5?

Cheers,
Helmut Schütz
ElMaestro
Hero

Denmark,
2018-11-07 16:22

@ Helmut
Posting: # 19543
Views: 344
 

 Yes it went a little fast

Hi Helmut,

» IMHO, calculating the S:N is no more than a finger exercise unless required by the method (e.g., traces in accredited environmental analysis where a true blank matrix does not exist).

Yes, I think this is my point stated in other words. There is no universally agreed method to define s:n, so as a consequence it is hardly effective or relevant to have a rule about its quantity.

» In bioanalytical methods I'm not sure whether it should be part of an (optional anyway) system suitability test. Playing the devil's advocate - two methods, A/P at the LLOQ 20/20%, S/N 10 or 15/15 5?

I don't follow this terminology, can you explain what you mean? My hindsight is 20/20.
S:N is done at the LLOQ, as far as I know. If it is "good enough" at the LLOQ then it is supposedly also "good enough" (whatever that means) anywhere above the LLOQ.

if (3) 4

Best regards,
ElMaestro

nobody
Senior

2018-11-07 18:30

@ ElMaestro
Posting: # 19544
Views: 333
 

 Yes it went a little fast

Formally, the limit of detection (LOD) is defined by the S:N ratio: the peak you can definitely distinguish from noise (limit of blank) with your method. In bioanalytics the LLOQ is by definition the concentration you can determine with a P/A of 20%/±20% (by a convention method, i.e., spiked samples with n=......).

If you (or anybody else in the room...) think your LOD is higher than the LLOQ, you have a problem, indeed. Otherwise... see Helllcourages statement.

Kindest regards, nobody
Helmut
Hero
Vienna, Austria,
2018-11-08 09:46

@ ElMaestro
Posting: # 19545
Views: 291
 

 Purpose of S:N?

Hi ElMaestro,

» » [...] two methods, A/P at the LLOQ 20/20%, S/N 10 or 15/15 5?
»
» I don't follow this terminology, can you explain what you mean? My hindsight is 20/20.

Sorry, my wording was not clear. I meant: Both methods fulfil the required conditions for (in)accuracy and precision at the LLOQ (20%). The first one is right at the border (20/20) but with a nice S:N of 10. The second one is better in terms of A/P (15/15) despite a worse S:N of 5.
I guess most people would prefer #2. I don't see the purpose of S:N. IMHO, it is nice to know only.
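
For readers unfamiliar with the A/P shorthand, here is a small worked sketch in R of how (in)accuracy and precision at the LLOQ are commonly summarised: bias as percent deviation of the mean from nominal, precision as CV%. The replicate values are invented for illustration; only the 20% limit comes from the discussion.

```r
nominal    <- 0.1                                  # pg/mL, nominal LLOQ (illustrative)
replicates <- c(0.092, 0.111, 0.105, 0.088, 0.097) # hypothetical back-calculated conc.

bias_pct <- 100 * (mean(replicates) - nominal) / nominal  # (in)accuracy
cv_pct   <- 100 * sd(replicates) / mean(replicates)       # precision

limit <- 20                                        # % limit at the LLOQ
passes <- abs(bias_pct) <= limit && cv_pct <= limit
c(bias_pct = bias_pct, cv_pct = cv_pct, passes = passes)
```

Note that neither number depends on any definition of noise, which is the contrast with S:N drawn in this post.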

Cheers,
Helmut Schütz
ElMaestro
Hero

Denmark,
2018-11-08 22:27

@ Helmut
Posting: # 19549
Views: 273
 

 Purpose of S:N?

Hi Hötzi,

» I don't see the purpose of S:N. IMHO, it is nice to know only.

EMA: " (...) the analyte signal of the LLOQ sample should be at least 5 times the signal of a blank sample"

So, it becomes a choice between doing something meaningful or doing something that appears compliant with the guideline :-D
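
Taken literally, the quoted sentence reduces to a one-line comparison, sketched below in R. The function name and the responses are hypothetical, and what counts as "signal" (height, area, area ratio with IS) is exactly what the guideline sentence leaves open.

```r
# Literal reading of the quoted criterion: LLOQ signal >= 5 x blank signal.
passes_lloq_ratio <- function(lloq_signal, blank_signal, factor = 5) {
  lloq_signal >= factor * blank_signal
}

lloq_signal   <- 1250                 # hypothetical response of the LLOQ sample
blank_signals <- c(180, 240, 310)     # hypothetical responses of three blank lots

ratio <- lloq_signal / blank_signals
ok    <- passes_lloq_ratio(lloq_signal, blank_signals)
data.frame(blank = blank_signals, ratio = round(ratio, 2), ok = ok)
```

The check itself is trivial; all the ambiguity sits in how the two inputs are obtained.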

if (3) 4

Best regards,
ElMaestro

nobody
Senior

2018-11-09 10:06

@ ElMaestro
Posting: # 19552
Views: 237
 

 Purpose of S:N?

Do you evaluate peak height as the signal for your assay? Or peak area?

Kindest regards, nobody
Helmut
Hero
Vienna, Austria,
2018-11-09 11:34

@ nobody
Posting: # 19554
Views: 228
 

 Purpose of S:N?

» Do you evaluate peak height as the signal for your assay? Or peak area?

Mainly the latter. However,

the signal
is the signal
is the signal

- not its integral over time. :-D

Cheers,
Helmut Schütz
nobody
Senior

2018-11-09 12:08

@ Helmut
Posting: # 19555
Views: 229
 

 Purpose of S:N?

» » Do you evaluate peak height as the signal for your assay? Or peak area?
»
» Mainly the latter. However,

» the signal
» is the signal
» is the signal
»
» - not its integral over time. :-D

Is it stated somewhere that peak height is the signal? You need a baseline (whatever that is) to establish this signal (as for the area, so both parameters don't fall out of the clear blue sky but might be considered convention methods, as the PO has noticed). Why not use the same signal as used for calculation of P/A, i.e. most often the area? Or the ratio of area with IS?

Kindest regards, nobody
Helmut
Hero
Vienna, Austria,
2018-11-09 12:34

@ nobody
Posting: # 19556
Views: 223
 

 Purpose of S:N?

» Is that somewhere stated, that peak height is the signal?

Not only the two references I gave above but all about detection in chromatography since the 1960s.

» You need a baseline (whatever that is) to establish this signal (as for the area, so both parameters don't fall out of the clear blue sky but might be considered convention methods, as the PO has noticed). Why not using the same signal as used for calculation of P/A, i.e. most often area? OR ratio of area with IS?

Simple. Given the CDS's settings appropriate to integrate the peak (bunching, smoothing, slope limits for start/end, area threshold) the noise will not be integrated.
BTW, I'm curious how other members of the forum calculate the noise.

Cheers,
Helmut Schütz
Helmut
Hero
Vienna, Austria,
2018-11-09 11:26

@ ElMaestro
Posting: # 19553
Views: 235
 

 Purpose of S:N?

Hi ElMaestro,

» EMA: " (...) the analyte signal of the LLOQ sample should be at least 5 times the signal of a blank sample"

Oh dear, I missed that for years. Biased perception.

» So, it becomes a choice between doing something meaningful or doing something that appears compliant with the guideline :-D

Let me continue my series of methods. Number 3: A/P 10/10% but S:N 3:1. Although I have some doubts whether we can achieve such a nice A/P with such a bad S:N, let it be. I consider this one the "best" of the three. I would never ever use one of the others only because the S:N is higher.

Cheers,
Helmut Schütz
ElMaestro
Hero

Denmark,
2018-11-09 13:16

@ Helmut
Posting: # 19557
Views: 211
 

 Purpose of S:N?

Hi Hötzi,

» Let me continue my series of methods. Number 3: A/P 10/10% but S:N 3:1. Although I have some doubts whether we can achieve such a nice A/P with such a bad S:N, let it be.

Surely you can. Your only sin is that you have picked the wrong arbitrary value of k :-D:-D:-D:-D:-D:-D:-D

if (3) 4

Best regards,
ElMaestro

paulhurleyuk
Junior

NE UK,
2018-11-05 16:27

@ ElMaestro
Posting: # 19531
Views: 528
 

 Isn't it time for an open source chromatographic format

» I think it would be great if an open source format for time series of chromatograms data were defined

As often happens, some might argue we already have too many:

JCAMP-DX
AIA (.cdf)
AnIML

Hopefully AnIML, an open XML-based schema for analytical data, gets enough traction to push adoption.

Paul.

The BIOEQUIVALENCE / BIOAVAILABILITY FORUM is hosted by
BEBAC Ing. Helmut Schütz