Friday, July 14, 2017

Not my human evolution.

Back in December, I tweeted out some thoughts about a blog post by Jerry Coyne.

A lot of people noticed, including JC (who wrote a few response posts) and his followers (some of whom like to tell me I'm stupid).  A lot of anthropologists, scientists, and journalists noticed too, including many beyond my circles.

A page out of "Think BIG and Kick-Ass in Business and Life" by President Business

When this happened, I was traveling with my family for the holidays, and only brought my phone, not my laptop. (Despite my laptop being more than a "work" tool, I have fierce work-life opinions so it stayed behind.)

In January, when I got back to my computer, I wrote up something that had been simmering for many years but that started really bubbling when Judd Apatow tweeted out a page on evolution from one of Trump's books, during the campaign. Then it all boiled over with the JC incident.

I knew where I wanted this essay to go and I sent it there as soon as I finished the last sentence. They said yes, then they went to work on editing and finding the right time to post when other stories wouldn't completely overshadow it.

It's up today at the Washington Post. Click on this to go there.

Warning! If you were sitting in our symposium at Evolution 2017 in Portland last month, it may give you déjà vu.

Wednesday, July 12, 2017

Nasty People: Explicit Content

This is a guest post by Sophia Weaver (University of Rhode Island '16, Anthropology + Gender and Women's Studies), the author and illustrator of 
Nasty People: an Illustrated Guide to Understanding Sex. 

Getting in the Mood 


Sex is often represented as something that is fast and dirty with lots of eager, panting motions, but it isn’t always that way… Getting turned on is a MAJOR part of sex, a part that is very often overlooked. Different things turn different people on and the process is very complicated and personal. While every person is turned on by different things, the biological process of being turned on is basic:

Before actually doing the nasty, a series of events must take place in order for the experience to be the best it can be. Some research suggests that 30 minutes of foreplay leads to the BEST of the BEST sex (aka an encounter that is chock full of climaxes). However, while aiming for the best is great, sub-best sex is still great. If you think about it, if you only ever had the BEST sex, it wouldn’t actually be the best anymore…

It is important to remember that every person is different in terms of what turns them on and how long it takes them to get in the mood. There are some universal differences between men and women, but there is still a vast ocean of variability between all individuals.  Generally speaking, women are looking more for the “ideal personality” while men are looking for the “ideal look.” This makes sense in evolutionary terms: women are, deep, deep down, looking for a quality collaborator while men are, deep, deep down, looking for quality ways to spread their seed. Still, there must always be an emphasis on VARIABILITY: everyone is different even though we are basically all the same…

Suggestion: Ask your partner (or partners) what turns them on…just the act of asking might end up being a turn on…

Arousal is, in essence, a process of turning on the on’s (activating the sexual accelerator) and off the off’s (deactivating the sexual brakes). Accomplishing this depends on every individual whose accelerator and brakes are uniquely molded by their explicit and implicit sexual education. Context matters. Context really, really matters. Just imagine you and your partner are trying to get down to business but there is a baby crying in the next room. No amount of romantic music and candlelight is going to make that context sexy. However, yet again, I cannot stress enough how important it is to understand that everyone is different. The right context versus the wrong one varies between individuals. Sometimes individuals under stress have increased sex drives and others might have a dwindled sex drive while stressed…what’s the lesson? COMMUNICATE. Maybe it’s awkward, but trust me, it is so worth it.

Once You’re in the Mood
The stages...
Excitement: Increased heart rate and blood pressure, increased sensitivity, nipple erection, increased odor, lubrication and erection.

Plateau: Continued increase in heart rate and blood pressure, increased breathing rate, and involuntary muscle contractions.

Orgasm: Heart rate, blood pressure and breathing rate reach maximum levels, and involuntary muscle spasms.

Resolution Phase: Return to non-excited state.

Physiological responses to sexual stimuli DO NOT necessarily indicate desire or consent!

The Finale
These days we have a bunch of options for birth control so sex isn’t always about making babies. And even if babies are not your thing, it is pretty cool to see how they get made…

Thursday, June 22, 2017

Everything is genetic, isn't it?

There is hardly a trait, physical or behavioral, for which there is not at least some familial resemblance, especially among close relatives.  And I'm talking about what is meant when someone scolds you saying, "You're just like your mother!"  The more distant the relatives in terms of generations of separation, the less the similarity.  So you really can resist when told, "You're just like your great-grandmother!" The genetic effects decline in a systematic way with more distant kinship.
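To put rough numbers on that systematic decline, here is a minimal sketch. It assumes the textbook expectation that each generation of separation halves the expected fraction of the genome shared with a direct ancestor; the function name is mine, purely for illustration, and actual sharing varies around these averages.

```python
def expected_shared_fraction(generations: int) -> float:
    """Expected fraction of the genome shared with a direct ancestor.

    `generations` is the number of generations of separation:
    1 = parent, 2 = grandparent, 3 = great-grandparent.
    Assumes simple halving per generation (an average, not a guarantee).
    """
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return 0.5 ** generations


# Mother, grandmother, great-grandmother.
for label, gens in [("mother", 1), ("grandmother", 2), ("great-grandmother", 3)]:
    print(label, expected_shared_fraction(gens))
```

So, on average, the great-grandmother comparison rests on a quarter as much shared genome as the mother comparison.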

The 'heritability' of a trait refers to the relative degree to which its variation is the result of variation in genes, the rest being due to variation in non-genetic factors we call 'environment'.  Heritability is a ratio that ranges from zero when genes have nothing to do with the trait, to 1.0 when all the variation is genetic.  The measure applies to a sample or population and cannot automatically be extended to other samples or populations, where both genetic and environmental variation will be different, often to an unknown extent.
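In symbols, and only as a sketch: write V_G for the variance in the trait due to genetic variation and V_E for the variance due to 'environment' (my labels, chosen for illustration), and assume the two simply add, with no interaction or covariance between them. Then the ratio described above is

```latex
h^2 = \frac{V_G}{V_G + V_E}, \qquad 0 \le h^2 \le 1
```

Both variances are properties of the particular sample, which is why the ratio cannot automatically be carried over to another sample or population.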

Most quantitative traits, like stature or blood pressure or IQ scores, show some amount, often quite substantial, of genetic influence.  It often happens that we are interested in some trait that we think must be produced or affected by genes, but for which no relevant factor, like a protein, is known.  The idea arose decades ago that if we could scan the genome, and compare those with different manifestations of the trait, using mapping techniques like GWAS (genomewide association studies), we could identify those sites, genomewide, whose variation in our chosen sample may affect the trait's variation.  Qualitative traits like the presence or absence of a disease (say, diabetes or hypertension), may often be due to the presence of some set of genetic variants whose joint impact exceeds some diagnostic threshold, and mapping studies can compare genotypes in affected cases to unaffected controls to identify those sites.
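As a minimal sketch of the case-control comparison at a single site, here is a plain odds ratio computed from allele counts. The counts are made up and the function name is hypothetical; real mapping studies use regression models, covariates, and corrections for testing a very large number of sites, none of which is shown here.

```python
def allele_odds_ratio(case_alt: int, case_ref: int,
                      control_alt: int, control_ref: int) -> float:
    """Odds of carrying the variant allele in cases relative to controls."""
    if min(case_alt, case_ref, control_alt, control_ref) <= 0:
        raise ValueError("all four counts must be positive for this simple ratio")
    return (case_alt * control_ref) / (case_ref * control_alt)


# Hypothetical allele counts at one site (illustrative numbers only).
ratio = allele_odds_ratio(case_alt=30, case_ref=70, control_alt=15, control_ref=85)
print(round(ratio, 2))
```

A ratio near 1 means the variant is about equally common in cases and controls; mapping amounts to repeating a comparison like this across the genome and asking which sites stand out.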

Genes are involved in everything. . . . .
Many things can affect the amount of similarity among relatives, so one has to try to think carefully about attributing ideas of similarity and cause.  Some traits, like stature (height) have very high heritability, sometimes estimated to be about 0.9, that is, 90% of the variation being due to the effects of genetic variation.  Other traits have much lower heritability, but there's generally familial similarity.  And, that's because we each develop from a single fertilized egg cell, which includes transmission of each of our parents' genomes, plus ingredients provided by the egg (and perhaps to a tiny degree sperm), much of which were the result of gene action in our parents when they produced that sperm or egg (e.g., RNA, proteins).  This is why traits can usually be found to have some heritability--some contribution due to genetic variation among the sampled individuals.  In that sense, we can say that genes are involved in everything.

Understanding the genetic factors involved in disease can be important and laudable, even if tracking them down is a frustrating challenge.  But because genes are involved in everything, our society also seems to have an unending lust for investigators to overstate the value of their findings or, in particular, to estimate or declaim on the heritability, and hence genetic determination, of the most societally sensitive traits, like sexuality, criminality, race, intelligence, physical abuse and the like.

. . . . . but not everything is 'genetic'!

If the estimated heritability for a trait we care about is substantial, then this does suggest the obvious: genes are contributing to the mechanisms of the trait and so it is reasonable to acknowledge that genetic variation contributes to variation in the trait.  However, the mapping industry implies a somewhat different claim: it is that genes are a major factor in the sense that individual variants can be identified that are useful predictors of the trait of interest (NIH's lobbying machine has been saying we'll be able to predict future disease with 'precision').  There has been little constraint on the types of trait for which this approach, sometimes little more than belief or wishful-thinking, is appropriate.

It is important to understand that our standard measures of genes' relative effect are affected both by genetic variation and environmental lifestyle factors.  That means that if environments were to change, the relative genetic effects, even in the very same individuals, would also change.  But it isn't just environments that change; genotypes change, too, when mutations occur, and as with environmental factors, these change in ways that we cannot  predict even in principle.  That means that we cannot legitimately extrapolate, to a knowable extent, the genetic or environmental factors we observe in a given sample or population, to other, much less to future samples or populations.  This is not a secret problem, but it doesn't seem to temper claims of dramatic discoveries, in regard to disease or perhaps even more for societally sensitive traits.
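A small numerical sketch of that point, assuming the simplest possible model in which genetic and environmental variances just add (the variance values and names below are invented for illustration):

```python
def heritability(genetic_variance: float, environmental_variance: float) -> float:
    """Share of total trait variance attributed to genetic variation.

    Assumes the two sources of variance are independent and additive.
    """
    total = genetic_variance + environmental_variance
    if total <= 0:
        raise ValueError("total variance must be positive")
    return genetic_variance / total


# Same genetic variance, two hypothetical environments.
genetic_variance = 4.0
for label, env_variance in [("uniform environment", 1.0), ("varied environment", 12.0)]:
    print(label, heritability(genetic_variance, env_variance))
```

Nothing about the genes differs between the two cases, yet the estimated genetic share drops from 0.8 to 0.25 simply because the environments became more varied.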

But let's assume, correctly, that genetic variation affects a trait.  How does it work?  The usual finding is that tens or even hundreds of genome locations affect variation in the test trait.  Yet most of the effects of individual genes are very small or rare in the sample.  At least as important is that the bulk of the estimated heritability remains unaccounted for, and unless we're far off base somehow, the unaccounted fraction is due to the leaf-litter of variants individually too weak or too rare to reach significance.

Often it's also asserted that all the effects are additive, which makes things tractable: for every new person, not part of the study, just identify their variants and add up their estimated individual effects to get the total effect on the new person for whatever publishable trait you're interested in.  That's the predictive objective of the mapping studies.  However, I think that for many reasons one cannot accept that these variable sites' actions are truly additive. The reasons have to do with actual biology, not the statistical convenience of using the results to diagnose or predict traits.  Cells and their compounds vary in concentrations per volume (3D), binding properties (multiple dimensions), surface areas (2D) and some in various ways that affect how proteins are assembled and work, and so on.  In aggregate, additivity may come out in the wash, but the usual goal of applied measures is to extrapolate these average results to prediction in individuals.  There are many reasons to wish that were true, but few to believe it very strongly.
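For concreteness, the additive recipe amounts to something like the following sketch. The site names, effect sizes, and genotype are hypothetical, and the function is mine rather than any particular study's method:

```python
def additive_score(genotype: dict, effects: dict) -> float:
    """Sum of per-allele effect estimates weighted by allele counts.

    genotype: site name -> number of copies of the variant allele (0, 1, or 2)
    effects:  site name -> estimated per-allele effect from some earlier study
    Sites missing from the genotype are treated as zero copies.
    """
    return sum(effect * genotype.get(site, 0) for site, effect in effects.items())


# Hypothetical effect estimates and one new person's genotype.
effects = {"site_a": 0.5, "site_b": -0.25, "site_c": 0.1}
new_person = {"site_a": 2, "site_b": 1, "site_c": 0}
print(additive_score(new_person, effects))
```

The arithmetic is trivial, which is the appeal. Whether the biology honors it in any given individual is the open question.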

Even if they were really additive, the clearly very different leaf-litter background that together accounts for the bulk of the heritability can obscure the numerical amount of that additivity from sample to sample and person to person.  That is, what you estimated from this sample, may not apply, to an unknowable extent, to the next sample.  If and when it does work, we're lucky that our assumptions weren't too far off.

Of course, the focus and promises from the genetics interests assume that environment has nothing serious to do with the genetic effects.  But it's a major, often by far the major, factor, and it may even in principle be far more changeable than genetic variation.  One would have to say that environmental rather than genetic measures are likely to be, by far, the most important things to change in society's interest.

We regularly write these things here not just to be nay-sayers, but to try to stress what the issues are, hoping that someone, by luck or insight, finds better solutions or different ways to approach the problem, which a century of genetics, despite its incredibly huge progress, has not yet done.  What it has done is to show us, in exquisite detail, what the problems are.

A friend and himself a good scientist in relevant areas, Michael Joyner, has passed on a rather apt suggestion to me, that he says he saw in work by Denis Noble.  We might be better off if we thought of the genome as a keyboard rather than as a code or program.  That is a good way to think about the subtle point that, in the end, yes, Virginia, there really are genomic effects: genes affect every trait....but not every trait is 'genetic'!

Tuesday, June 20, 2017

Spooky action at a (short) distance

Entanglement in physics is about action that seems to transfer some sort of 'information' across distances at speeds faster than that of light.  Roughly speaking (I'm not a physicist!), it is about objects with states that are not fixed in advance, and could take various forms but must differ between them, and that are separated from each other.  When measurement is made on one of them, whatever the result, the corresponding object takes on its opposite state.  That means the states are not entirely due to local factors, and somehow the second object 'knows' what state the first was observed in and takes on a different state.

You can read about this in many places and understand it better than I do or than I've explained it here.  Albert Einstein was skeptical that this could occur, if the speed of light were the fastest possible speed.  So he famously called the findings as they stood at that time "Spooky action at a distance." But the findings have stood many specific tests, and seem to be real, however it happens.

Does life, too, have spooky action? 
I think the answer is: maybe so.  But it is at a very short distance, that within the nuclei of individual cells.  Organisms have multiple chromosomes and many species, like humans, have 2 instances of each (are 'diploid'), one inherited from each parent.  I say 'instances' rather than 'copies', because they are not identical to each other nor to those of the parent that transmitted each of them.  They are perhaps near copies, but mutation always occurs, even among the cells within each of us, so each cell differs from its contemporary somatic fellows and from what we inherited in our single-cell beginnings as a fertilized egg.

Many clever studies over many years have been documenting the 3-dimensional, context-specific conformation, or detailed physical arrangement of chromosomes within cells.  The work is variously known, but one catch-term is chromosome conformation capture, or 3C, and I'll use that here.  Unless or until this approach is shown to be too laden with laboratory artifact (it's quite sophisticated), we'll assume it's more or less right.

The gist of the phenomenon is that (1) a given cell type, under a given set of conditions, is using only a subset of its genes (for my  purposes here this generally means protein-coding genes proper); (2) these active genes are scattered along and between the chromosomes, with intervening inactive regions (genes not being used at the moment); (3) the cell's gene expression pattern can change quickly when its circumstances change, as it responds to environmental conditions, during cell division, etc.; (4) at least to some extent the active regions seem to be clustered physically together in expression-centers in the nucleus; (5) this all implies that there is extensive trans communication, coordinating, and physically juxtaposing, parts within and among each chromosome--there is action at a very short distance.

Even more remarkably, I think, this phenomenon seems somehow robust to speciation because related species have similar functions and similar sets of genes, but often their chromosomes have been extensively rearranged during their evolutionary separation. More than this: each person has different DNA sequences due to mutation, and different numbers of genes due to copy number changes (duplications, deletions); yet the complex local juxtapositions seem to work anyway.  At present this is so complicated, so regular, and so changeable and thus so poorly understood, that I think we can reasonably parrot Einstein and call it 'spooky'.

What this means is that chromosomes are not just randomly floating around like a bowl of spaghetti.   Gene expression (including transcribed non-coding RNAs) is thought to be based on the sequence-specific binding of tens of transcription factors in an expression complex that is (usually) just upstream of the transcribed part.  Since a given cell under given conditions is expressing thousands of condition-specific genes, there must be very extensive interaction or 'communication' in trans, that is, across all the chromosomes. That's because the cell can change its expression set very quickly.

The 3C results show that in a given type of cell under given conditions, the chromosomes are physically very non-randomly arranged, with active centers physically very near or perhaps touching each other.  How this massive plate of apparent-spaghetti even physically rearranges to get these areas together, without getting totally tangled up, yet to be quickly rearrangeable is, to me, spooky if anything in Nature is.  The entanglement, disentanglement, and re-entanglement happens genome wide, which is implicitly what the classical term  'polygenic' essentially recognized related to genetic causation, but is now being documented.

The usual approach of genetics these days is to sequence and enumerate various short functional bits as being coding, regulatory, enhancing, inhibiting, transcribing etc. other parts nearby.  We have long been able to analyze cDNA and decide which parts are being used for protein coding, at least. Locally, we can see why or how this happens, in the sense that we can identify the transcription factors and their binding sites, called promoters, enhancers and the like, and the actual protein or functional RNA codes.  We can find expression correlates by extracting them from cells and enumerating them.  3C analysis appears to show that these coding elements are, at least to some extent, found juxtaposed in various transcription hot-spots.

Is gene expression 'entangled'?
What if the molecular aspects of the 3C research were shown to be technical artifacts, relative to what is really going on?  I have read some skepticism about that, concerning what is found in single cells vs aggregates of 'identical' cells.  If 3C stumbles, will our idea of polygenic condition-specific gene usage change?   I think not.  We needn't have 3C data to show the functional results since they are already there to see (e.g., in cell-specific expression studies--cDNA and what ENCODE has found). If 3C has been misleading for technical or other reasons, it would just mean that something else just as spooky but different from the 3D arrangement that 3C detects, is responsible for correlating the genomewide trans gene usage.  And it's of course 4-dimensional since it's time-dependent, too.  So what I've said here still will apply, even if for some other, unknown or even unsuspected reason.

The existing observations on context-specific gene expression show that something 'entangles' different parts of the genome for coordinated use, and that can change very rapidly.  The same genome, among the different types of cells of an individual, can behave very differently in this sense. Somehow, its various chromosomal regions 'know' how to be, or, better put, are coordinated.  This seems at least plausibly to be more than just that a specific context-specific set of transcription factors (TFs) binds selectively near regions to be transcribed and changes in its thousands of details almost instantly.  What TFs?  and how does a given TF know which binding sites to grab or to release, moment by moment, since they typically bind enhancers or promoters of many different genes, not all of them expression-related.  And if you want to dismiss that, by saying for example that this has to do with which TFs are themselves being produced, or which parts of DNA are unwrapped at each particular time, then you're just bumping the same question about trans control up, or over, to a different level of what's involved.  That's no answer!

And there is even another, seemingly simpler example to show that we really don't understand what's going on: the alignment of homologues in the first stage of meiosis.  We've been taught that empirical and necessary fact about meiosis for many decades. But how do the two homologues find each other to align?  This is essentially just not mentioned, if anyone even was asking, in textbooks.  I've seen some speculative ideas, again involving what I'll call 'electromagnetic' properties of each chromosome but even their authors didn't really claim it was sufficient or definitive.  Just for examples, homologous chromosomes in a diploid individual have different rearrangements, deletions, duplications, and all sorts of heterozygous sequence details, yet by and large they still seem to find each other in meiosis.  Something's going on!

How might this be tested?
I don't have any answers, but I wonder, on the hypothesis that these thoughts are on target, how we might set up some critical experiments to test this.  I don't know if we can push the analogy with tests for quantum entanglement or not, but probably not.

One might hope that 'all' we have to do is enumerate sequence bits to account for this action-at-a-distance, this very detailed trans phenomenon.  But I wonder......I wonder if there may be something entirely unanticipated or even unknown that could be responsible.  Maybe there are 'electromagnetic' properties or something akin to that, that are involved in such detailed 4D contextually relativistic phenomena.

Suppose that what happens at one chromosomal location (let's just call it the binding of a TF), directly affects whether that or a different TF binds somewhere else at the same time.  Whatever causes the first event, if that's how it works, the distance effect would be a very non-local phenomenon, one so central to organized life in complex organisms that, causally, is not just a set of local gene expressions.  Somehow, some sort of 'information' is at work very fast and over very short distances. It is the worst sort of arrogance to assume it is all just encoded in DNA as a code we can read off along the strand and that will succumb to enumerative local informatic sequence analysis.

The current kind of purely local hypothetical sequence enumeration-based account seems too ordinary--it's not spooky enough!

Monday, June 19, 2017

More thoughts, and just plain provocateuring, on genomic causal complexity. . . .

Here are some follow-up reflections on my recent post about GWAS and kindred methods and claims.  I know I'm being contentious, but science has always been contentious.  However, socioeconomic issues (careers, salaries, etc) also enter the picture in a way that is relevant to the inertial nature of our profession.  Readers who haven't read Ludwik Fleck's 1930's volume on 'thought collectives', one preceding Kuhn's 'normal science/paradigm' discussion, should do that, because it's relevant to where we stand now.

The causal complexity of genetic control of quantitative traits was in principle understood by Fisher and others almost exactly a century ago.  The development of mapping tools opened the door to seeing what that meant more specifically, at the genome level.

Some key facts about this, I think, are that when there is a strong single signal, we see segregation in families (when there are enough families, as there were in Utah for BRCA mapping), or some other indicator (a detectable chromosomal deletion in Wilms' tumor and perhaps something similar in retinoblastoma).  Those were family studies, and the traits were mainly monogenic in the Mendelian sense (that is, like the traits Mendel carefully chose to study for their simple states).

But for BRCA and, I think for different reasons, retinitis pigmentosa, mapping by association rather than in families doesn't find these genes 'for' the trait.  They're individually strong, but relatively minor in a population and hence association-mapping sense.  And, in nearly all cases, even with 'single locus' diseases, once the gene is known, we see genotypic complexity, including often very low 'penetrance' (showing that 'the' gene isn't a single-locus cause by itself).

BRCA-associated breast cancer risk, once the gene was known and could specifically be typed, is very different even among women carrying known high-risk BRCA1/2 alleles, depending on the cohort and the study.  The purported single-locus Hemochromatosis gene (HFE) mutations are associated with high risk in the original sample, in Utah if my recollection is correct, but the mutation does not cause the disease in other samples.  Even the classic PKU is not always caused by PAH alleles, and not all pathogenic PAH alleles cause PKU.  Ditto for CF and the CFTR gene.  In some cases, at least, it is likely other interacting genes that in particular populations lead the target gene to seem causal in a Mendelian-like sense.

And of course there is now a substantial literature showing that individuals carrying dead (non-activated) disease-causing genes are walking around without the disease.  I think estimates have shown that each of us carries many (100 or so?) such genes, at least some if not all of which are diploid-negative.  If this doesn't suggest pervasive redundancy and the mappability problems I and others have written about, what does it suggest?

I will once again utter the apparently off-color factor that few want to acknowledge or say in mixed company: somatic mutation. Enough said on that black-box subject.

And while invoking the Truth's name in vain, I'll just whisper here another off-color word: environment.  Enough said on that black-box subject, too.

And there is the non-reductionistic 4th dimension of genetic causation in cells, which is being studied by chromosomal conformation methods (3C and its variants).  What this will lead to is unclear, to me anyway, but clearly there are extensive trans phenomena that sequence parsing and enumerating methods, par for the course now for many decades, are not solving.  If they were working, we wouldn't need a plethora of new terms, and gilded promises from on high (i.e., NIH).

I've often mentioned that much of what we do relies on statistical inference.  That's been getting a well-deserved bad name, but rest assured that the SAS and SPSS people will guarantee you that their packages or use-instructions have been fixed so they won't lead you astray any further.  Nonetheless, there is this third little secret: statistical methods in this arena assume various aspects of replicability while adaptive evolution is fundamentally about non-replicability.

In any case, estimating risk-factor (causal SNP) effects retrospectively is data-fitting and not, in itself, related to cause or prediction, much less doing so with 'precision'.  Such extrapolation rests on the assumption that past fractions mean future probability, which is critical here (especially when sampling, environment, mutation, somatic mutation etc. are inherently unpredictable and essentially non-replicable).

And is it too identity-political to mention that there is the unseemly fact that most of this intensive mapping work has been done on Europeans for the sometimes even openly acknowledged rationale that Europeans have the moolah to pay for the gene-targeted drugs that Pharma has been promising for decades of the genome era? In any case, that's mere racism relative to the deeper genetic-causal issues themselves.  Even restrictive sampling doesn't guarantee replicability; a point I won't mention again lest I be accused of being as repetitive as someone doing GWAS on obesity.

These are just the simple issues one can conjure up without even doing any PubMed searching.  What amount of hammering does it take to get the message to sink in?  By sinking in I mean not just being noted, briefly and in passing, but to force some change of approach beyond enumerating, random sampling, and cachet marketing words (like gene regulatory networks, precision genomic medicine, omnigenic, and all the 'omics'-du-jour, etc.).

I would want to be clear even for those who wish to trash all my thoughts: Go ahead!  But at least acknowledge, as I acknowledged in my previous post, that the mapping era did do us great service by providing, for the first time, some specific sense of the genomic details underlying life's causal complexity and showing that in a general sense the original polygenic model was basically right.  Family studies are better when some really meaningfully single strong factor is at work, but the use of IBD assumptions to do association mapping cast, like a flashlight in the dark, light upon what had perforce remained dark to our understanding.  But it's now been quite a while that we have had the understanding we need to know that we should think of different ways to approach the subject of life's causation.  The flashlight's batteries are fading.

And here's my bit of sympathy for what is going on.  Complementing the complexity landscape that is the obvious reality are the key facts underlying all of this: scientists are people and, including yours truly, have limited abilities and can't just facilely be slammed for their not accounting for everything perfectly and immediately.  We're people who, mainly, need salaries, facilities in which to work, and employers like universities who these days have to operate in the black.  These are the deeply socioeconomic underlying problems that serve to encourage or even to force safe science, big science, and oversold science.  That the news media and other vested interests compound the felony is simply one of the problems of our type of imperfect society.

Moving the Big Money that has been locked up by the current haves, to redistribute to more important-payoff kinds of research would inevitably meet resistance, including from NIH's head office, which has been a sloganeering center that makes PT Barnum look like an amateur.  Whether, how, or how much funding will be redirected, which is what's actually at the unstated core of much of the controversy, is obviously not predictable.  But the importance of trying is what motivates my perhaps too-often and too-cranky posts:  Somebody has to speak of the Emperor's clothes!

Until we fix these underlying issues, whatever mess our current thrust is embedding us in, they will persist until some lucky day when an actually better idea stumbles upon the stage.

Saturday, June 17, 2017

The GWAS hoax....or was it a hoax? Is it a hoax?

A long time ago, in 2000, in Nature Genetics, Joe Terwilliger and I critiqued the idea then being pushed by the powers-that-be, that the genomewide mapping of complex diseases was going to be straightforward, because of the 'theory' (that is, rationale) then being proposed that common variants caused common disease.  At one point, the idea was that only about 50,000 markers would be needed to map any such trait in any global populations.  I and collaborators can claim that in several papers in prominent journals, in a 1992 Cambridge Press book, Genetic Variation and Human Disease, and many times on this blog we have pointed out numerous reasons, based on what we know about evolution, why this was going to be a largely empty promise.  It has been inconvenient for this message to be heard, much less heeded, for reasons we've also discussed in many blog posts.

Before we get into that, it's important to note that unlike me, Joe has moved on to other things, like helping Dennis Rodman's diplomatic efforts in North Korea (here, Joe's shaking hands as he arrives in his most recent trip).  Well, I'm more boring by far, so I guess I'll carry on with my message for today.....

There's now a new paper, coining a new catch-word (omnigenic), to proclaim the major finding that complex traits are genetically complex.  The paper seems solid and clearly worthy of note.  The authors examine the chromosomal distribution of sites that seem to affect a trait, in various ways including chromosomal conformation.  They argue, convincingly, that mapping shows that complex traits are affected by sites strewn across the genome, and they provide a discussion of the pattern and findings.

The authors claim an 'expanded' view of complex traits, and as far as that goes it is justified in detail. What they are adding to the current picture is the idea that mapped traits are affected by 'core' genes but that other regions spread across the genome also contribute. In my view the idea of core genes is largely either obvious (as a toy example, the levels of insulin will relate to the insulin gene) or the concept will be shown to be unclear.  I say this because one can probably always retroactively identify mapped locations and proclaim 'core' elements, but why should any genome region that affects a trait be considered 'non-core'?

In any case, that would be just a semantic point if it were not predictably the phrase that launched a thousand grant applications.  I think neither the basic claim of conceptual novelty, nor the breathless exploitive treatment of it by the news media, is warranted: we've known these basic facts about genomic complexity for a long time, even if the new analysis provides other ways to find or characterize the multiplicity of contributing genome regions.  This assumes that mapping markers are close enough to functionally relevant sites that the latter can be found, and that the unmappable fraction of the heritability isn't leading to over-interpretation of what is 'mapped' (reached significance) or that what isn't won't change the picture.

However, I think the first thing we really need to do is understand the futility of thinking of complex traits as genetic in the 'precision genomic medicine' sense, and the last thing we need is yet another slogan by which hands can remain clasped around billions of dollars for Big Data resting on false promises.  Yet even the new paper itself ends with the ritual ploy, the assertion of the essential need for more information--this time, on gene regulatory networks.  I think it's already safe to assure any reader that these, too, will prove to be as obvious and as elusively ephemeral as genome wide association studies (GWAS) have been.

So was GWAS a hoax on the public?
No!  We've had a theory of complex (quantitative) traits since the early 1900s.  Other authors argued similarly, but RA Fisher's famous 1918 paper is the typical landmark paper.  His theory was, simply put, that infinitely many genome sites contribute to quantitative (what we now call polygenic) traits.  The general model has jibed with the age-old experience of breeders who have used empirical strategies to improve crop or pet species.  Since association mapping (GWAS) became practicable, they have used mapping-related genotypes to help select animals for breeding; but genomic causation is so complex and changeable that they've recognized even this will have to be regularly updated.

But when genomewide mapping of complex traits was first really done (a prime example being BRCA genes and breast cancer) it seemed that apparently complex traits might, after all, have mappable genetic causes. BRCA1 was found by linkage mapping in multiply affected families (an important point!), in which a strong-effect allele was segregating.  The use of association mapping was a tool of convenience: it used random samples (like cases vs controls) because one could hardly get sufficient multiply affected families for every trait one wanted to study.  GWAS rested on the assumption that genetic variants were identical by descent from common ancestral mutations, so that a current-day sample captured the latest descendants of an implied deep family: quite a conceptual coup based on the ability to identify association marker alleles across the genome identical by descent from the un-studied shared remote ancestors.

Until it was tried, we really didn't know how tractable such mapping of complex traits might be. Perhaps heritability estimates based on quantitative statistical models were hiding what really could be enumerable, replicable causes, in which case mapping could lead us to functionally relevant genes. It was certainly worth a try!

But it was quickly clear that this was in important ways a fool's errand.  Yes, some good things were to be found here and there, but the hoped-for miracle findings generally weren't there to be found. This, however, was a success not a failure!  It showed us what the genomic causal landscape looked like, in real data rather than just Fisher's theoretical imagination.  It was real science.  It was in the public interest.

But that was then.  It taught us its lessons, in clear terms (of which the new paper provides some detailed aspects).  But it long ago reached the point of diminishing returns.  In that sense, it's time to move on.

So, then, is GWAS a hoax?
Here, the answer must now be 'yes'!  Once the lesson is learned, bluntly speaking, continuing on is more a matter of keeping the funds flowing than profound new insights.  Anyone paying attention should by now know very well what the GWAS etc. lessons have been: complex traits are not genetic in the usual sense of being due to tractable, replicable genetic causation.  Omnigenic traits, the new catchword, will prove the same.

There may not literally be infinitely many contributing sites as in the original statistical models, be they core or peripheral, but infinitely many isn't so far off.  Hundreds or thousands of sites, and accounting for only a fraction of the heritability means essentially infinitely many contributors, for any practical purposes.  This is particularly so since the set is not a closed one:  new mutations are always arising and current variants dying away, and along with somatic mutation, the number of contributing sites is open ended, and not enumerable within or among samples.
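To make the 'essentially infinitely many' point concrete, here is a minimal Python sketch; it is a toy, not anything taken from the omnigenic paper or from real GWAS data.  It assumes a purely additive trait with independent sites, allele frequencies drawn uniformly between 0.01 and 0.5, and allelic effects drawn from a standard normal distribution.  Those choices, and the function names site_variances and sites_needed, are illustrative assumptions only.  Under them, each site contributes additive variance 2p(1-p)a^2, and the sketch counts how many of the largest contributors are needed to reach a given fraction of the total.

```python
import random


def site_variances(n_sites, seed=0):
    """Additive variance 2*p*(1-p)*a**2 for each of n_sites hypothetical sites."""
    rng = random.Random(seed)
    variances = []
    for _ in range(n_sites):
        p = rng.uniform(0.01, 0.5)  # allele frequency (assumed range)
        a = rng.gauss(0.0, 1.0)     # allelic effect (assumed normal)
        variances.append(2.0 * p * (1.0 - p) * a * a)
    return variances


def sites_needed(variances, fraction):
    """Smallest number of top sites whose summed variance reaches `fraction` of the total."""
    total = sum(variances)
    running = 0.0
    for count, v in enumerate(sorted(variances, reverse=True), start=1):
        running += v
        if running >= fraction * total:
            return count
    return len(variances)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        v = site_variances(n)
        print(n, "sites; top sites needed for half the variance:", sites_needed(v, 0.5))
```

Running it with different numbers of sites shows how the count of 'needed' sites behaves under these assumed distributions; real genomes add linkage, interaction, environment, and new mutation, none of which this toy includes.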

The problem is actually worse.  All these data are retrospective statistical fits to samples of past outcomes (e.g., sampled individuals' blood pressures, or cases' vs controls' genotypes).  Past experience is not an automatic prediction of future risk.  Future mutations are not predictable, not even in principle.  Future environments and lifestyles, including major climatic dislocations, wars, epidemics and the like are not predictable, not even in principle.  Future somatic mutations are not predictable, not even in principle.

GWAS almost uniformly have found (1) different mapping results in different samples or populations, (2) only a fraction of heritability is accounted for by tens, hundreds, or even thousands of genome locations and (3) even relatively replicable 'major' contributors, themselves usually (though not always) small in their absolute effect, have widely varying risk effects among samples.

These facts are all entirely expectable based on evolutionary considerations, and they have long been known, both in principle, indirectly, and from detailed mapping of complex traits.  There are other well-known reasons why, based on evolutionary considerations, among other things, this kind of picture should be expected.  They involve the blatantly obvious redundancy in genetic causation, which is the result of the origin of genes by duplication and the highly complex pathways to our traits, among other things.  We've written about them here in the past.  So, given what we now know, more of this kind of Big Data is a hoax, and as such, a drain on public resources and, perhaps worse, on the public trust in science.

What 'omnigenic' might really mean is interesting.  It could mean that we're pressing up ever more intensely against the log-jam of understanding based on an enumerative gestalt about genetics.  Ever more detail, always promising that if we enumerate and catalog just a bit more (in this case, the authors say we need to study gene regulatory networks) we'll understand.  But that is a failure to ask the right question: why and how could every trait be affected by every part of the genome?  Until someone starts looking at the deeper mysteries we've been identifying, we won't have the transformative insight that seems to be called for, in my view.

To use Kuhn's term, this really is normal science pressing up against a conceptual barrier, in my view. The authors work the details, but there's scant hint they recognize we need something more than more of the same.  What is called for, I think, is young people who haven't already been propagandized about the current way of thinking, the current grantsmanship path to careers.

Perhaps more importantly, I think the situation is at present an especially cruel hoax, because there are real health problems, and real, tragic, truly genetic diseases that a major shift in public funding could enable real science to address.

Saturday, June 3, 2017

The real reason Graham Spanier's going to jail

This post will seem to be a serious diversion from our usual topics, but in another sense it is actually of the same sort.  There's a lesson to be learned from Penn State's recent inglorious history, or perhaps a 'meta-lesson', and since we are at Penn State, it's perhaps an appropriate context for us.

Our former President, Graham Spanier, was just yesterday sentenced to some jail time for his conviction on charges related to negligent response to reports of Jerry Sandusky's child abuse. Spanier and two other high officials at Penn State were convicted of essentially turning their heads away from reports of abuse.  They have essentially acknowledged this, though Dr Spanier still seems to be wriggling, unconvinced that, despite the tragedy for many abused boys, he looked the other way when action could have been taken.  All the evidence we see in the news, at least, suggests that the administrators did in fact do that, as did rightfully legendary football coach Joe Paterno.

They all didn't want to know unpleasant things, or conveniently didn't see them.  This doesn't mean they were bad people who wanted little boys to be abused.  It means they didn't respond to some indirect information they received about possible abuse going on in a Penn State athletic facility.  In a sense this was the most convenient response, rather than be bothered by something that they, after all, didn't see directly and that would at best be a distraction from their busy lives.

But there is a far deeper part of the story, and it has to do with why this happened. Seeing that bigger picture is the only way to understand what has happened.  The reason has to do with the nature of our society generally and how it applies to universities.  Our former President is going to jail because he was part of a system that favors money, convenience, and appearance over ethics.

The Spin Society
As university president, Graham Spanier seemed to have an insatiable hunger for attention.  He was in front of a camera all the time and everywhere.  He was a very good spinner and fundraiser.  The University's coffers grew admirably during his administration.  He became prominent in many ways, none of them intellectual or very seriously about education.

Penn State grew in size, even though it's hard to argue that bigger classes or athletic programs were about serious-level education.  Dr Spanier didn't impede good faculty hiring or strong research. Indeed, he approved of it and helped.  But his administration was not an intellectual one or a drive for higher educational quality or more rigor per se.  More research could be counted in terms of dollars flowing in and publications flowing out.  If it happened, or if he could help, that was terrific.

But Graham Spanier's downfall was because he was part of the Spin Society.  His presidency was focused on image, and image was based on money.  Uncomfortable facts were dealt with quietly, or brushed aside.  Social activism was ignored or given a patronizing pat on the head.  Anything that was good for image was good for Penn State.

Graham Spanier is going to jail because he was a cog in this wheel, a wheel that is the essence of contemporary American society.  In the Spin Society, competition for resources is the bottom line, more the operant word, and cardboard cutouts the easiest approach to take.  Boards of Trustees expect it (and may not tolerate Presidents who actually try to do something to shake the system itself).  Since Boards are responsible for the tenor of the university and the tenure of its leaders, maybe it is they who should be behind bars, not the agent whom they encouraged and rewarded for 15+ years.

If we see Spanier's sentence as simply the just reward of a negligent or miscreant individual, then we are misunderstanding, and are complicit in, the worse that is likely to come.  If we see this as a Penn State problem, then we are complicit in allowing the rest of the country to carry on business as has become usual.

A 1952 French movie called Nous Sommes Tous des Assassins, or We Are All Assassins, was the story of a murderer on death row who was being personally held responsible and punished, even though his acts were more deeply the result of the nature of society, for which we are all responsible.

We do not need an increasingly spin-driven society, where essential dishonesty is at the core of how we operate. There surely are other, better ways to live.

And in a case like this one at Penn State, who is really the guilty party?

Friday, June 2, 2017

Allegiance to the Earth: The Environmentalism Pledge

For her final project in Anthropology/Biology 282G Sapiens: The Changing Nature of Human Evolution, recent University of Rhode Island graduate Marisa DeCollibus created something wonderful.

During her studies at URI, she gained expertise in psychology, learning, and education. From that vantage, she wrote a pledge of allegiance to the earth to be recited daily by K-12 students.

In the companion paper she writes, "Today's Kindergarteners are tomorrow's impactful Sapiens ..." and "I tried to create imagery that invoked being a part of a collective landscape instead of being the rulers of the landscape..."

Here's the pledge.

The Environmentalism Pledge
by Marisa DeCollibus

I pledge to care for the natural world
Of all living and nonliving formations
And to the resources
Of which we share
One planet
Amongst stars
With intention
And effort
For all

Please share widely and please let us know if you think it's just as wonderful as we do and, especially, if you begin reciting this with your children. If you'd like to get in touch with Marisa, please let us know.

Friday, May 5, 2017

Eugenics, such an old-fashioned idea

It's the age of genetics.  Billions of dollars have been spent on identifying genes for important traits like diseases, fun traits like ear wax type and hair color, politically-charged traits like who we vote for and whether we're criminals, and much more.  "Precision Medicine," the idea that with bigger and better genetic data we'll be able to predict future diseases and then, presumably, prevent them, is au courant, and well-funded.

The assumption that genes determine not only our disease futures but our personalities, our preferences, and our behavior, appeals to a lot of people; some of us are naturally good and some of us are naturally bad.  And this has led many of us to worry about the return of eugenics, the Darwinian idea that populations can be improved by controlled reproduction.  That is, that we control their reproduction.  Those of us who are naturally bad just shouldn't be allowed to reproduce.  This was an idea that early 20th century America translated into the forced sterilization of the intellectually or socially inferior other, and that the Nazis translated (in many ways copying our lead) into mass murder of anyone they didn't like.

It turns out, though, that the worry about eugenics is now out-of-date.  It's too finely-honed a tool. The Republican majority in the US House of Representatives, with the enthusiastic support of our 45th president, has just passed a bill to repeal and replace the Affordable Care Act (ACA), President Obama's signature program to expand affordable access to medical care to 45 million people who had no health insurance, and to those for whom it was prohibitively expensive.

One of the most humane, important, and best-liked provisions of the ACA was that it did not allow insurers to discriminate against people who had "pre-existing conditions", illnesses that preceded their insurance coverage.  Insurance companies don't like to have to cover sick people because they cost money.  Fair enough, I suppose, given that insurers are businesses, not philanthropies, and have to make a profit (unlike a civilized country's national healthcare system, which is by and for the people rather than the plutocrats).  But, this is how all insurance works, car, home, flood and otherwise -- we all pay in, some of us cost more than we pay in, and some of us cost less.  If it's only sick people, or bad drivers, or people in hurricane zones who buy insurance, insurance companies would all quickly be out of business, which of course is why we all are required to buy car and home insurance.
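The pooling arithmetic behind that last point can be sketched in a few lines of Python.  The numbers are invented purely for illustration (90 people expected to cost $1,000 a year and 10 expected to cost $30,000), and break_even_premium is a hypothetical helper that ignores profit, administration, and everything else a real insurer would count.

```python
def break_even_premium(members):
    """Average expected annual cost per enrollee.

    `members` is a list of (number_of_people, expected_annual_cost) pairs.
    """
    total_people = sum(count for count, _ in members)
    total_cost = sum(count * cost for count, cost in members)
    return total_cost / total_people


# Hypothetical pool: everyone buys in.
everyone = [(90, 1000.0), (10, 30000.0)]
# Hypothetical pool: only the people who expect to be sick buy in.
only_sick = [(10, 30000.0)]

if __name__ == "__main__":
    print("premium if everyone enrolls:", break_even_premium(everyone))
    print("premium if only the sick enroll:", break_even_premium(only_sick))
```

With these made-up figures the break-even premium is $3,900 when everyone is in the pool and $30,000 when only the sick are, which is the whole rationale for a mandate.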

But it turns out that there are good moral reasons to discriminate against people with pre-existing conditions -- according to a member of the Republican, white, male "Freedom Caucus", the extreme, and let's be honest, extremely ill-informed right-wingers in the House, pre-existing conditions don't happen to people who live good lives.  (Funny how their new list of pre-existing conditions includes pregnancy, rape, sexual harassment, breast cancer, among many other things, but not erectile dysfunction or prostate cancer. Nice discussion of this topic here.)

To ensure that covering actual sick people was going to be affordable, the ACA mandated that everyone have health insurance.  The political right never liked this provision of the law -- depending on your reading, this was due to the libertarian view that governments shouldn't be able to require that we do anything, or because they didn't want their money covering them, or perhaps a toxic mix of both -- and they've been fighting it ever since.  It's long been clear that they have no idea why a mandate was essential.  Because, who knew that health care was so complicated?

As is well known, the Republicans voted at least a zillion times to repeal the ACA while Obama was president.  Finally, yesterday, under the caring leadership of our current president, the Republican-led House passed a repeal-and-replace bill that would essentially eliminate protection for people with pre-existing conditions, as well as the requirement that healthy people purchase insurance.  And, in an ugly and cynical move that makes abundantly clear the racist and other lies behind this bill, they voted to exempt themselves from its new constraints (of course, because they're the good guys!).

This bill is bad medicine.  But that's irrelevant to Republicans and their supporters.  It's not meant to be much more than a tax cut for the rich (protecting wealth being the only core tenet of that party). And a thumb in the eye of anyone who benefitted from the ACA: the poor, the sick, and the Democrats.  It will definitely be a money saver, when 24 million people lose coverage, and then die of things that those with money don't have to die of, as Jimmy Kimmel said in his emotional defense of insurance for all.

And this is what brings us back to eugenics.  Who needs the kind of very expensive, targeted precision promised by knowledge of genes to cherrypick those who should live and those who should die?  Let's just take away access to medical care from all of Them.  And make our country great for the oligarchs again.

Thursday, April 27, 2017

The Law of No Restraint

There's a new law of science reporting or, perhaps more accurately put, of the science jungle.  The law is to feed any story, no matter how fantastic, to science journalists (including your university's PR spinners), and they will pick up whatever can be spun into a Big Story, and feed it to the eager mainstream media.  Caveats may appear somewhere in the stories, but not the headlines so that, however weak or tentative or incredible, the story gets its exposure anyway.  Then on to tomorrow's over-sell.

One rationale for this is that unexpected findings--typically presented breathlessly as 'discoveries'--sell: they rate the headline. The caveats and doubts that might un-headline the story may be reported as well, but often buried in minimal terms late in the report.  Even if the report balances skeptics and claimants, simply publishing the story is enough to give at least some credence to the discovery.

The science journalism industry is heavily inflated in our commercial, 24/7 news environment. It would be better for science, if not for sales, if all these hyped papers, rather than being publicized at the time the paper is published, first appeared in musty journals for specialists to argue over, and in the pop-sci news only after some mature judgments are made about them.  Of course, that's not good for commercial or academic business.

We have just seen a piece reporting that humans were in California something like 135,000 years ago, rather than the well-established continental dates of about 12,000.  The report, which I won't grace by citing here, and you've probably seen it anyway, then went on to speculate about what 'species' of our ancestors these early guys might have been.

Why is this so questionable?  If it were a finding on its own, it might seem credible, but given the plethora of skeletal and cultural archeological findings, up and down the Americas, such an ancient habitation seems a stretch.  There is no comparable trail of earlier settlements in northeast Asia or Alaska that might suggest it, and there are lots of animal and human archeological remains--all basically consistent with each other, so why has no earlier finding yet been made?  It is of course possible that this is the first and is a correct one, but it is far too soon for this to merit a headline story, even with caveats.

Another piece we saw today reported that a new analysis casts doubt on whether diets high in saturated fat are bad for you.  This was a meta-analysis of various other studies that have been done, and got some headline treatment because the authors report that, contrary to many findings over many years, saturated fats don't clog arteries. Instead, they say, coronary heart disease is a chronic inflammatory condition.  Naturally, the study's basic data are being challenged, as reflected in this story's discussion, by critiques of its data and method.  These get into details we're not qualified to judge, and we can't comment on the relative merits of the case.

However, one thing we can note is that with respect to coronary heart disease, study after study has reported more or less the same, or at least consistent findings about the correlation between saturated fats and risk. Still, despite so very much careful science, including physiological studies as well as statistical analysis of population samples, can we still apparently not be sure about a dietary component that we've been told for years should play a much reduced role in what we eat?  How on earth could we possibly still not know about saturated fat diets and disease risk?

If this very basic issue is unresolved after so long, and the story is similar for risk factors for many complex diseases, then what is all this promise of 'precise' medicine all about?  Causal explanations are still fundamentally unclear for many cancers, dementias, psychiatric disorders, heart disease, and so on.  So why isn't the most serious conclusion that our methods and approaches themselves are for some reason simply not adequate to answer such seemingly simple questions as 'is saturated fat bad for you?'  Was the plethora of previous studies all flawed in some way?  Is the current study?  Does the publicizing of the studies itself change behaviors in ways that affect future studies?

There may be no better explanation than that diets and physiology are hard to measure and are complex, and that no simple answer is true.  We may all differ for genetic and other reasons to such an extent that population averages are untrustworthy, or our habits may change enough that studies don't get consistent answers.  Or asking about one such risk factor when diets and lifestyles are complex is a science modus operandi that developed for studying simpler things (like exposure to toxins or bacteria, the basis of classical epidemiology), and we simply need a better gestalt from which to work.

Clearly a contributory sociological factor is that the science industry has simply been cruising down the same rails despite constant popping of promise bubbles, for decades now.  It's always more money for more and bigger studies.  It's rarely let's stop and take a deep breath and think of some better way to understand (in this case) dietary relationships to physical traits.  In times past, at least, most stories like the ancient Californian didn't get ink so widely and rapidly.  But if I'm running a journal, or a media network, or am a journalist needing to earn my living, and I need to turn a buck, naturally I need to write about things that aren't yet understood.

Unfortunately, as we've noted before, the science industry is a hungry beast that needs its continual feeding, and (like our 3 cats) always demands more, more, and more.  There are ways we could reform things, at least up to a point.  We'll never end the fact that some scientists will claim almost anything to get attention, and we'll always be faced with data that suggest one thing that doesn't turn out that way.  But we should be able to temper the level of BS and get back more to sober science rather than sausage factory 'productivity'.  And educate the public that some questions can't be answered the way we'd like, or aren't being asked in the right way.  But that is something science might address effectively, if it weren't so rushed and pressured to 'produce'.

Thursday, April 20, 2017

Some genetic non-sense about nonsense genes

The April 12 issue of Nature has a research report and a main article about what is basically presented as the discovery that people typically carry doubly knocked-out genes, but show no effect. The editorial (p 171) notes that the report (p 235) uses an inbred population to isolate double knockout genes (that is, recessive homozygous null mutations), and looks at their effects.  The population sampled, from Pakistan, has high levels of consanguineous marriages.  The criteria for a knockout mutation were based on the protein coding sequence.

We have no reason to question the technical accuracy of the papers, nor their relevance to biomedical and other genetics, but there are reasons to assert that this is nothing newly discovered, and that the story misses the really central point that should, I think, be undermining the expensive Big Data/GWAS approach to biological causation.

First, for some years now there have been reports of samples of individual humans (perhaps also of yeast, but I can't recall specifically) in which both copies of a gene appear to be inactivated.  The criteria for saying so are generally indirect, based on nonsense, frameshift, or splice-site mutations in the protein code.  That is, there are other aspects of coding regions that may be relevant to whether this is a truly thorough search to see that whatever is coded really is non-functional.  The authors mention some of these.  But, basically, costly as it is, this is science on the cheap because it clearly only addresses some aspects of gene functionality.  It would obviously be almost impossible to show either that the gene was never expressed or never worked. For our purposes here, we need not question the finding itself.  The fact that this is not a first discovery does raise the question why a journal like Nature is so desperate for Dramatic Finding stories, since this one really should be instead a report in one of many specialty human genetics journals.

Secondly, there are causes other than coding mutations for gene inactivation. They have to do with regulatory sequences, and inactivating mutations in that part of a gene's functional structure are much more difficult, if not impossible, to detect with any completeness.  A gene's coding sequence itself may seem fine, but its regulatory sequences may simply not enable it to be expressed. Gene regulation depends on epigenetic DNA modification, multiple transcription factor binding sites, the functional aspects of the many proteins required to activate a gene, and other aspects of the local DNA environment (such as RNA editing or RNA interference).  The point here is that there are likely to be many other instances of people with complete or effectively complete double knockouts of genes.

Thirdly, the assertion that these double KOs have no effect depends on various assumptions.  Mainly, it assumes that the sampled individuals will not, in the future, experience the otherwise-expected phenotypic effects of their defunct genes.  Effects may depend on age, sex, and environmental effects rather than necessarily being a congenital yes/no functional effect.

Fourthly, there may be many coding mutations that make the protein non-functional, but these are ignored by this sort of study because they aren't clear knockout mutations, yet they are in whatever data are used for comparison of phenotypic outcomes.  Post-translational modification, RNA editing, RNA modification, and other aspects of a 'gene' are not picked up by this approach.

Fifthly, and by far most important, I think, is that this is the tip of the iceberg of redundancy in genetic functions.  In that sense, the current paper is a kind of factoid that reflects what GWAS has been showing in great, if implicit, detail for a long time: there is great complexity and redundancy in biological functions.  Individual mapped genes typically affect trait values or disease risks only slightly.  Different combinations of variants at tens, hundreds, or even thousands of genome sites can yield essentially the same phenotype (and here we ignore the environment which makes things even more causally blurred).
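A deliberately oversimplified count shows how quickly such redundancy grows.  In this Python sketch, assume n sites of equal effect, each carrying either a '+' or a '-' variant (a haploid toy, ignoring dominance, interaction, and environment), with the phenotype being simply the number of '+' variants.  The function name equivalent_genotypes is made up for illustration.

```python
import math


def equivalent_genotypes(n_sites, n_plus):
    """Number of distinct '+'/'-' combinations over n_sites with exactly n_plus '+' variants."""
    return math.comb(n_sites, n_plus)


if __name__ == "__main__":
    for n in (10, 20, 100):
        # Phenotype in the middle of the range: half the sites carry '+'.
        print(n, "sites:", equivalent_genotypes(n, n // 2), "genotypes, same phenotype")
```

Even with only 20 such sites, 184,756 different genotypes give the identical mid-range phenotype in this toy, so two individuals with the same trait value need not share any particular causal variant.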

Sixthly, other samples and certainly other populations, as well as individuals within the Pakistani data base, surely carry various aspects of redundant pathways, from plenty of them to none.  Indeed, the inbreeding that was used in this study obviously affects the rest of the genome, and there's no particular way to know in what way, or more importantly, in which individuals.  The authors found a number of basically trivial or no-effect results as it is, even after their hunt across the genome. Whether some individuals had an attributable effect of a particular double knockout is problematic at best.  Every sample, even of the same population, and certainly of other populations, will have different background genotypes (homozygous or not), so this is largely a fishing expedition in a particular pond that cannot seriously be extrapolated to other samples.

Finally, this study cannot address the effect of somatic mutation on phenotypes and their risk of occurrence.  Who knows how many local tissues have experienced double-knockout mutations and produced (or not produced) some disease or other phenotype outcome?  Constitutive genome sequencing cannot detect this.  Surely we should know this very inconvenient fact by now!

Given the well-documented and pervasive biological redundancy, it is not any sort of surprise that some genes can be non-functional and the individual still be phenotypically within a viable, normal range. Not only is this not a surprise, especially by now in the history of genetics, but its most important implication is that our Big Data genetic reductionistic experiment has been very successful!  It has, or should have, shown us that we are not going to be getting our money's worth from that approach.  It will yield some predictions in the sense of retrospective data fitting to case-control or other GWAS-like samples, and it will be trumpeted as a Big Success, but such findings, even if wholly correct, cannot yield reliable true predictions of future risk.

Does environment, by any chance, affect the studied traits?  We have, in principle, no way to know what environmental exposures (or somatic mutations) will be like.  The by now very well documented leaf-litter of rare and/or small-effect variants plagues GWAS for practical statistical reasons (and is why usually only a fraction of heritability is accounted for).  Naturally, finding a single doubly inactivated gene may, but by no means need, yield reliable trait predictions.

By now, we know of many individual genes whose coded function is so proximate or central to some trait that mutations in such genes can have predictable effects.  This is the case with many of the classical 'Mendelian' disorders and traits that we've known for decades.  Molecular methods have admirably identified the gene and mutations in it whose effects are understandable in functional terms (for example, because the mutation destroys a key aspect of a coded protein's function).  Examples are Huntington's disease, PKU, cystic fibrosis, and many others.

However, these are at best the exceptions that lured us to think that even more complex, often late-onset traits would be mappable so that we could parlay massive investment in computerized data sets into solid predictions and identify the 'druggable' genes-for that Big Pharma could target.  This was predictably an illusion, as some of us were saying long ago and for the right reasons.  Everyone should know better now, and this paper just reinforces the point, to the extent that one can assert that it's the political economic aspects of science funding, science careers, and hungry publications, and not the science itself, that lead to the persistence of drives to continue or expand the same methods anyway.  Naturally (or should one say reflexively?), the authors advocate a huge Human Knockout Project to study every gene--today's reflex Big Data proposal.**

Instead, it's clearly time to recognize the relative futility of this, and change gears to more focused problems that might actually punch their weight in real genetic solutions!

** [NOTE added in a revision.  We should have a wealth of data by now, from many different inbred mouse and other animal strains, and from specific knockout experiments in such animals, to know that the findings of the Pakistani family paper are to be expected.  About 1/4 to 1/3 of knockout experiments in mice have no effect or not the same effect as in humans, or have no or different effect in other inbred mouse strains.  How many times do we have to learn the same lesson?  Indeed, with existing genomewide sequence databases from many species, one can search for 2KO'ed genes.  We don't really need a new megaproject to have lots of comparable data.]

Wednesday, April 12, 2017

Reforming research funding and universities

Any aspect of society needs to be examined on a continual basis to see how it could be improved.  University research, such as that which depends on grants from the National Institutes of Health, is one area that needs reform. It has gradually become an enormous, money-directed, and largely self-serving industry, and its need for external grant funding turns science into a factory-like industry, which undermines what science should be about, advancing knowledge for the benefit of society.  

The Trump policy, if there is one, is unclear, as with much of what he says on the spur of the moment. He's threatened to reduce the NIH budget, but he's also said to favor an increase, so it's hard to know whether this represents whims du jour or policy.  But regardless of what comes from on high, it is clear to many of us with experience in the system that health and other science research has become very costly relative to its promise and too largely mechanical rather than inspired.

For these reasons, it is worth considering what reforms could be taken--knowing that changing the direction of a dependency behemoth like NIH research funding has to be slow because too many people's self-interests will be threatened--if we were to deliver in a more targeted and cost-efficient way on what researchers promise.  Here's a list of some changes that are long overdue.  In what follows, I have a few FYI asides for readers who are unfamiliar with the issues.

1.  Reduce grant overhead amounts
[FYI:  Federal grants come with direct and indirect costs.  Direct costs pay the research staff, the supplies and equipment, travel and collecting data and so on.  Indirect costs are worked out for each university, and are awarded on top of the direct costs--and given to the university administrators.  If I get $100,000 on a grant, my university will get $50,000 or more, sometimes even more than $100K.  Their claim to this money is that they have to provide the labs, libraries, electricity, water, administrative support and so on, for the project, and that without the project they'd not have these expenses. Indeed, an indicator of the fat that is in overhead is that as an 'incentive' or 'reward', some overhead is returned as extra cash to the investigator who generated it.]

University administrations have notoriously been ballooning.  Administrators and their often fancy offices depend on individual grant overhead, which naturally puts intense pressure on faculty members to 'deliver'.  Educational institutions should be lean and efficient. Universities should pay for their own buildings and libraries and pare back bureaucracy. Some combination of state support, donations, and bloc grants could be developed to cover infrastructure, if not tied to individual projects or investigators' grants. 
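For readers who want the overhead arithmetic spelled out, here is a minimal sketch. It assumes a single flat indirect-cost rate applied to all direct costs, which is a simplification; the function name is hypothetical and the two rates are just the $50,000 and $100K cases from the FYI above.

```python
def total_award(direct_costs: float, indirect_rate: float) -> dict:
    """Split a grant into direct and indirect (overhead) dollars.

    Simplifying assumption: overhead is a flat fraction of direct costs.
    """
    indirect = direct_costs * indirect_rate
    total = direct_costs + indirect
    return {
        "direct": direct_costs,
        "indirect": indirect,
        "total": total,
        # Fraction of every awarded dollar that goes to overhead.
        "overhead_share": indirect / total,
    }

# The two cases from the FYI: a $100,000 grant at 50% and at 100% overhead.
for rate in (0.5, 1.0):
    award = total_award(100_000, rate)
    print(f"rate {rate:.0%}: total ${award['total']:,.0f}, "
          f"overhead share {award['overhead_share']:.0%}")
```

At a 50% rate, a third of every dollar awarded goes to overhead; at 100%, it is half.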

2.  No faculty salaries on grants
[FYI:  Federal grants, from NIH at least, allow faculty investigators' salaries to be paid from grant funds.  That means that in many health-science universities, the university itself is paying only a fraction, often tiny and perhaps sometimes none, of their faculty's salaries.  Faculty without salary-paying grants will be paid some fraction of their purported salaries and often for a limited time only.  And salaries generate overhead, so they're now well paid: higher pay, higher overhead for administrators!  Duh, a no-brainer!]

Universities should pay their faculty's salaries from their own resources.  Originally, in my understanding, faculty investigators' salaries were reimbursed on grants so the University could hire temporary faculty to cover the PI's teaching and administrative obligations while s/he was doing the research.  Otherwise, if they're already paid to do research, what's the need? Faculty salaries paid on grants should only be allowed to be used in this way, not just as a source of cash.  Faculty should not be paid on soft money, because the need to hustle one's salary steadily is an obvious corrupting force on scientific originality and creativity.

3.  Limit on how much external funding any faculty member or lab could have
There is far too much reward for empire-builders. Some do, or at least started out doing, really good work, but that's not always the case and diminishing returns for expanding cost is typical.  One consequence is that new faculty are getting reduced teaching and administrative duties so they can (must!) write grant applications. Research empires are typically too large to be effective and often have absentee PIs off hustling, and are under pressure to keep the factory running.  That understandably generates intense pressure to play it safe (though claiming to be innovative); but good science is not a predictable factory product. 

4.  A unified national health database
We need health care reform, and if we had a single national health database it would reduce medical costs and could be anonymized so research could be done, by any qualified person, without additional grants.  One can question the research value of such huge databases, as is true even of the current ad hoc database systems we pay for, but they would at least be cost-effective.

5. Temper the growth ethic 
We are over-producing PhDs, and this is largely to satisfy the game of the current faculty by which status is gained by large labs.  There are too many graduate students and post-docs for the long-term job market.  This is taking a heavy personal toll on aspiring scientists.  Meanwhile, there is inertia at the top, where we have been prevented from imposing mandatory retirement ages.  Amicably changing this system will be hard and will require creative thinking; but it won't be as cruel as the system we have now.

6. An end to deceptive publication characteristics  
We routinely see papers listing more authors than there are residents in the NY phone book.  This is pure careerism in our factory-production mode.  As once was the standard, every author should in principle be able to explain his/her paper on short notice.  I've heard 15 minutes. Those who helped on a paper such as by providing some DNA samples, should be acknowledged, but not listed as authors. Dividing papers into least-publishable-units isn't new, but with the proliferation of journals, it's out of hand.  Limiting CV lengths (and not including grants on them) when it comes to promotion and tenure could focus researchers' attention on doing what's really important rather than chaff-building.  Chairs and Deans would have to recognize this, and move away from safe but gameable bean-counting.  

[FYI: We've moved towards judging people internally, and sometimes externally in grant applications, on the quantity of their publications rather than the quality, or on supposedly 'objective' (computer-tallied) citation counts.  This is play-it-safe bureaucracy and obviously encourages CV padding, which is reinforced by the proliferation of for-profit publishing.  Of course some people are highly successful both in the real scientific sense of making a major discovery and in publishing their work.  But it is naive not to realize that many, often the big players grant-wise, manipulate any counting-based system.  For example, they can cite their own work in ways that increase the 'citation count' that Deans see.  Papers with very many authors also lead to credit-claiming that is highly exaggerated relative to the actual scientific contribution.  Scientists quickly learn how to manipulate such 'objective' evaluation systems.]

7.  No more too-big-and-too-long-to-kill projects
The Manhattan Project and many others taught us that if we propose huge, open-ended projects we can have funding for life.  That's what the 'omics era and other epidemiological projects reflect today.  But projects that are so big they become politically invulnerable rarely continue to deliver the goods.  Of course, the PIs, the founders and subsequent generations, naturally cry that stopping their important project after having invested so much money will be wasteful!  But it's not as wasteful as continuing to invest in diminishing returns.  Project duration should be limited and known to all from the beginning.

8.  A re-recognition that science addressing focal questions is the best science
Really good science is risky because serious new findings can't be ordered up like hamburgers at McD's.  We have to allow scientists to try things.  Most ideas won't go anywhere.  But we don't have to allow open-ended 'projects' to scale up interminably as has been the case in the 'Big Data' era, where despite often-forced claims and PR spin, most of those projects don't go very far, either, though by their size alone they generate a blizzard of results. 

9. Stopping rules need to be in place  
For many multi-year or large-scale projects, an honest assessment part-way through would show that the original question or hypothesis was wrong or won't be answered.  Such a project (and its funds) should have to be ended when it is clear that its promise will not be met.  It should be a credit to an investigator who acknowledges that an idea just isn't working out, and those who don't should be barred for some years from further federal funding.  This is not a radical new idea: it is precedented in the drug trial area, and we should do the same in research.  

It should be routine for universities to provide continuity funding for productive investigators so they don't have to cling to go-nowhere projects. Faculty investigators should always have an operating budget so that they can do research without an active external grant.  Right now, they have to piggy-back their next idea by using funds in their current grant, and without internal continuity funding, this naturally leads to safe 'fundable' projects, rather than really innovative ones.  The reality is that truly innovative projects typically are not funded, because it's easy for grant review panels to fault-find and move on to the safer proposals.

10. Research funding should not be a university welfare program
Universities are important to society and need support.  Universities as well as scientists become entrenched.  It's natural.  But society deserves something for its funding generosity, and one of the facts of funding life could be that funds move.  Scientists shouldn't have a lock on funding any more than anybody else. Universities should be structured so they are not addicted to external funding on grants. Will this threaten jobs?  Most people in society have to deal with that, and scientists are generally very skilled people, so if one area of research shrinks others will expand.

11.  Rein in costly science publishing
Science publishing has become what one might call a greedy racket.  There are far too many journals, rushing out half-way reviewed papers for pay-as-you-go authors.  Papers are typically paid for on grant budgets (though one can ask how often young investigators shell out their own personal money to keep their careers).  Profiteering journals are proliferating to serve the CV-padding hyper-hasty bean-counting science industry that we have established.  Yet the vast majority of papers have basically no impact.  That money should go to actual research.

12.  Other ways to trim budgets without harming the science 
Budgets could be trimmed in many other ways, too:  no buying journal subscriptions on a grant (universities have subscriptions), less travel to meetings (we have Skype and Hangout!), shared costly equipment rather than a sequencer in every lab.  Grants should be smaller but of longer duration, so investigators can spend their time on research rather than hustling new grants. Junk the use of 'impact' factors and other bean-counting ways of judging faculty.  It had a point once--to reduce discrimination and be more objective, but it's long been strategized and manipulated, substituting quantity for quality.  Better evaluation means are needed.  

These suggestions are perhaps rather radical, but to the extent that they can somehow be implemented, it would have to be done humanely.  After all, people playing the game today are only doing what they were taught they must do.  Real reform is hard because science is now an entrenched part of society.  Nonetheless, a fair-minded (but determined!) phase-out of the abuses that have gradually developed would be good for science, and hence for the society that pays for it.

***NOTES:  As this was being edited, NY state has apparently just made its universities tuition-free for those whose families are not wealthy.  If true, what a step back towards sanity and public good!  The more states can get their universities off the grant and strings-attached private donation hooks, the more independent those universities should be able to be.

Also, the Apr 12 Wall St Journal has a story (paywall, unless you search for it on Twitter) showing the faults of an over-stressed health research system, including some of the points made here.  The article points out problems of non-replicability and other technical mistakes that are characteristic of our heavily over-burdened system.  But it doesn't go after the System as such, the bureaucracy and wastefulness and the pressure for 'big data' studies rather than focused research, and the need to be hasty and 'productive' in order to survive.

Wednesday, March 29, 2017

The (bad) luck of the draw; more evidence

A while back, Vogelstein and Tomasetti (V-T) published a paper in Science in which it was argued that most cancers cannot be attributed to known environmental factors, but instead were due simply to the errors in DNA replication that occur throughout life when cells divide.  See our earlier 2-part series on this.

Essentially the argument is that knowledge of the approximate number of at-risk cell divisions per unit of age could account for the age-related pattern of increase in cancers of different organs, if one ignored some obviously environmental causes like smoking.  Cigarette smoke is a mutagen and if cancer is a mutagenic disease, as it certainly largely is, then that will account for the dose-related pattern of lung and oral cancers.

This got enraged responses from environmental epidemiologists whose careers are vested in the idea that if people would avoid carcinogens they'd reduce their cancer risk.  Of course, this is partly just the environmental epidemiologists' natural reaction to their ox being gored--threats to their grant largesse and so on.  But it is also true that environmental factors of various kinds, in addition to smoking, have been associated with cancer; some dietary components, viruses, sunlight, even diagnostic x-rays if done early and often enough, and other factors.

Most associated risks from agents like these are small compared to smoking, but not zero, and an at least legitimate objection to V-T's paper might be that the suggestion that environmental pollution, dietary excess, and so on don't matter when it comes to cancer is wrong.  I think V-T are saying no such thing.  Clearly some environmental exposures are mutagens and it would take a really hard-core reactionary to deny that mutations are related to cancer.  Other external or lifestyle agents are mitogens; they stimulate cell division, and it would be silly not to think they could have a role in cancer.  If and when they do, it is not by causing mutations per se.  Instead mitogenic exposures in themselves just stimulate cell division, which is dangerous if the cell is already transformed into a cancer cell.  But it is also a way to increase cancer by just what V-T stress: the natural occurrence of mutations when cells divide.

There are a few who argue that cancer is due to transposable elements moving around and/or inserting into the genome where they can cause cells to misbehave, or other perhaps unknown factors such as of tissue organization, which can lead cells to 'misbehave', rather than mutations.

These alternatives are, currently, a rather minor cause of cancer.  In response to their critics, V-T have just published a new multi-national analysis that they suggest supports their theory.  They attempted to correct for the number of at-risk cells and so on, and found a convincing pattern that supports the intrinsic-mutation viewpoint.  They did this to rebut their critics.

This is at least in part an unnecessary food-fight.  When cells divide, DNA replication errors occur.  This seems well-documented (indeed, Vogelstein did some work years ago that showed evidence for somatic mutation--that is, DNA changes that are not inherited--in the genomes of cancer cells compared to normal cells of the same individual).  Indeed, for decades this has been known in various levels of detail.  Of course, showing that this is causal rather than coincidental is a separate problem, because the fact of mutations occurring during cell division doesn't necessarily mean that the mutations are causal. However, for several cancers the repeated involvement of specific genes, and the demonstration of mutations in the same gene or genes in many different individuals, or of the same effect in experimental mice and so on, is persuasive evidence that mutational change is important in cancer.

The specifics of that importance are in a sense somewhat separate from the assertion that environmental epidemiologists are complaining about.  Unfortunately, to a great extent this is a silly debate. In essence, besides professional pride and careerism, the debate should not be about whether mutations are involved in cancer causation but whether specific environmental sources of mutation are identifiable and individually strong enough, as x-rays and tobacco smoke are, to be identified and avoided.  Smoking targets particular cells in the oral cavity and lungs.  But exposures that are more generic, but individually rare or not associated with a specific item like smoking, and can't be avoided, might raise the rate of somatic mutation generally.  Just having a body temperature may be one such factor, for example.

I would say that we are inevitably exposed to chemicals and so on that will potentially damage cells, mutation being one such effect.  V-T are substantially correct, from what the data look like, in saying that (in our words) namable, specific, and avoidable environmental mutations are not the major systematic, organ-targeting cause of cancer.  Vague and/or generic exposure to mutagens will lead to mutations more or less randomly among our cells (maybe, depending on the agent, differently depending on how deep in our bodies the cells are relative to the outside world or other means of exposure).  The more at-risk cells, the longer they're at risk, and so on, the greater the chance that some cell will experience a transforming set of changes.
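That last point is simple probability, and a minimal sketch may help. The assumptions are mine and loudly so: each cell division is treated as an independent trial with the same tiny probability of producing a transforming event, which ignores the multi-hit nature of real cancers. The function name and all the numbers are hypothetical and chosen only to show the trend.

```python
from math import expm1, log1p

def prob_at_least_one_transformed(p_per_division: float,
                                  n_cells: float,
                                  divisions_per_cell: float) -> float:
    """P(at least one transforming event) = 1 - (1 - p)^(cells * divisions).

    Toy assumption: independent, identical risk at every division.
    """
    trials = n_cells * divisions_per_cell
    # expm1/log1p keep precision when p is tiny and trials is huge.
    return -expm1(trials * log1p(-p_per_division))

# Hypothetical values, not real estimates: more cells, or cells at risk
# for more divisions, both push the probability up.
p = 1e-12
for cells, divisions in [(1e6, 10), (1e8, 10), (1e8, 100)]:
    risk = prob_at_least_one_transformed(p, cells, divisions)
    print(f"cells={cells:.0e}, divisions={divisions}: risk={risk:.2e}")
```

Nothing environmental needs to be named for the risk to climb in this sketch; the count of at-risk divisions alone does the work, which is the essence of the V-T argument as I read it.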

Most of us probably inherit mutations in some of these genes from conception, and have to await other events to occur (whether these are mutational or of another nature as mentioned above).  The age patterns of cancers seem very convincingly to show that.  The real key factor here is the degree to which specific, identifiable, avoidable mutational agents can be identified.  It seems silly or, perhaps as likely, mere professional jealousy, to resist that idea.

These statements apply even if cancers are not all, or not entirely, due to mutational effects.  And, remember, not all of the mutations required to transform a cell need be of somatic origin.  Since cancer is mostly, and obviously, a multi-factor disease genetically (not a single mutation as a rule), we should not have our hackles raised if we find what seems obvious, that mutations are part of cell division, part of life.

There are curious things about cancer, such as our large body size but delayed onset ages relative to the occurrence of cancer in smaller, and younger animals like mice.  And different animals of different lifespans and body sizes, even different rodents, have different lifetime cancer risks (some may be the result of details of their inbreeding history or of inbreeding itself).  Mouse cancer rates increase with age and hence the number of at-risk cell divisions, but the overall risk at very young ages despite many fewer cell divisions (yet similar genome sizes) shows that even the spontaneous mutation idea of V-T has problems.  After all, elephants are huge and live very long lives; why don't they get cancer much earlier?

Overall, if correct, V-T's view should not give too much comfort to our 'Precision' genomic medicine sloganeers, another aspect of budget protection, because the bad luck mutations are generally somatic, not germline, and hence not susceptible to Big Data epidemiology, genetic or otherwise, that depends on germline variation as the predictor.

Related to this are the numerous reports of changes in life expectancy among various segments of society and how they are changing based on behaviors, most recently, for example, the opioid epidemic among whites in depressed areas of the US.  Such environmental changes are not predictable specifically, not even in principle, and can't be built into genome-based Big Data, or the budget-promoting promises coming out of NIH about such 'precision'.  Even estimated lifetime cancer risks associated with mutations in clear-cut risk-affecting genes, like BRCA1 mutations and breast cancer, vary greatly from population to population and study to study.  The V-T debate, and their obviously valid point, regardless of the details, is only part of the lifetime cancer risk story.

Just after posting this, I learned of a new story on this 'controversy' in The Atlantic.  It is really a silly debate, as noted in my original version.  It tacitly makes many different assumptions about whether this or that tinkering with our lifestyles will add to or reduce the risk of cancer and hence support the anti-V-T lobby.  If we're going to get into the nitty-gritty and typically very minor details about, for example, whether the statistical colon-cancer-protective effect of aspirin shows that V-T were wrong, then this really does smell of academic territory defense.

Why do I say that?  Because if we go down that road, we'll have to say that statins are cancer-causing, and so is exercise, and kidney transplants and who knows what else.  They cause cancer by allowing people to live longer, and accumulate more mutational damage to their cells.  And the supposedly serious opioid epidemic among Trump supporters actually is protective, because those people are dying earlier and not getting cancer!

The main point is that mutations are clearly involved in carcinogenesis, cell division life-history is clearly involved in carcinogenesis, environmental mutagens are clearly involved in carcinogenesis, and inherited mutations are clearly contributory to the additional effects of life-history events.  The silly extremism to which the objectors to V-T would take us would be to say that, obviously, if we avoided any interaction whatsoever with our environment, we'd never get cancer.  Of course, we'd all be so demented and immobilized with diverse organ-system failures that we wouldn't realize our good fortune in not getting cancer.

The story and much of the discussion on all sides is also rather naive even about the nature of cancer (and how many or of which mutations etc it takes to get cancer); but that's for another post sometime.

I'll add another new bit to my post, that I hadn't thought of when I wrote the original.  We have many ways to estimate mutation rates, in nature and in the laboratory.  They include parent-offspring comparison in genomewide sequencing samples, and there have been sperm-to-sperm comparisons.  I'm sure there are many other sets of data (see Michael Lynch in Trends in Genetics 2010 Aug; 26(8): 345–352).  These give a consistent picture and one can say, if one wants to, that the inherent mutation rate is due to identifiable environmental factors, but given the breadth of the data that's not much different than saying that mutations are 'in the air'.  There are even sex-specific differences.

The numerous mutation detection and repair mechanisms built into genomes add to the idea that mutations are part of life, that is, that they are not specifically related to modern human lifestyles.  Of course, evolution depends on mutation, so the mutation rate cannot be, and never has been, reduced to zero--a species that couldn't change doesn't last.  Mutations occur in plants and animals and prokaryotes, in all environments and, I believe, generally at rather similar species-specific rates.

If you want to argue that every mutation has an external (environmental) cause rather than an internal molecular one, that is merely saying there's no randomness in life or imperfection in molecular processes.  That is as much a philosophical as an empirical assertion (as perhaps any quantum physicist can tell you!).  The key, as asserted in the post here, is that for the environmentalists' claim to make sense, to be a mutational cause in the meaningful sense, the force or factor must be systematic and identifiable and tissue-specific, and it must be shown how it gets to the internal tissue in question and not to other tissues on the way in, etc.

Given how difficult it has been to chase down most environmental carcinogenic factors, to which exposure is more than very rare, and that the search has been going on for a very long time, and only a few have been found that are, in themselves, clearly causal (ultraviolet radiation, Human Papilloma Virus, ionizing radiation, the ones mentioned in the post), whatever is left over must be very weak, non tissue-specific, rare, and the like.  Even radiation-induced lung cancer in uranium miners has been challenging to prove (for example, because miners also largely were smokers).

It is not much of a stretch to simply say that even if, in principle, all mutations in our body's lifetime were due to external exposures, and the relevant mutagens could be identified and shown in some convincing way to be specifically carcinogenic in specific tissues, in practice if not ultra-reality, then the aggregate exposures to such mutations are unavoidable and epistemically random with respect to tissue and gene.  That I would say is the essence of the V-T finding.

Quibbling about that aspect of carcinogenesis is for those who have already determined how many angels dance on the head of a pin.