Mendenhall's Graphs Revisited

One method of measuring style that achieved some notoriety in the authorship stakes (mainly because others misrepresented its inventor's findings) was devised by an American physicist, Dr. T.C. Mendenhall, in the late nineteenth century. Mendenhall suggested that authors' styles might be 'fingerprinted' by counting the numbers of letters in the words they used. He demonstrated that different authors tended to display different profiles, which he illustrated by means of a graph showing how many one-letter words, how many two-letter words, how many three-letter words, etc. they tended to use. (1)
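The counting itself is simple enough to sketch in a few lines of Python. What follows is only an illustration of the idea, not Mendenhall's own procedure (his counts were made by hand): the function name word_length_profile, the cap at 15 letters and the sample sentence are all assumptions made for the example. Apostrophes are removed first, as described in note (2), so that 's does not register as a one-letter word.

```python
import re
from collections import Counter


def word_length_profile(text, max_len=15):
    """Return the proportion of words having 1, 2, ..., max_len letters.

    Words longer than max_len are counted in the last bin (an assumption
    made here so that every profile has the same length).
    """
    # Drop straight and curly apostrophes so "king's" counts as five letters.
    cleaned = text.replace("'", "").replace("\u2019", "")
    words = re.findall(r"[A-Za-z]+", cleaned)
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(min(len(word), max_len) for word in words)
    return [counts[n] / len(words) for n in range(1, max_len + 1)]


# Hypothetical sample; a real profile needs many thousands of words.
profile = word_length_profile("I am the king's man")
for length, share in enumerate(profile[:5], start=1):
    print(f"{length}-letter words: {share:.2f}")
```

Plotting such a list of proportions against word length gives the kind of 'characteristic curve' described above.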

Hearing of this, a strong supporter of the case for Francis Bacon as the author of Shakespeare's works sought Mendenhall's help. Unfortunately for him, Mendenhall found the profiles of Bacon and Shakespeare to be quite different. As a part of his control group, however, Mendenhall also asked his team of 'word counters' to calculate a profile for Marlowe and, as the story goes, "something akin to a sensation was produced among those engaged in the work". "In the characteristic curve of his plays Marlowe agrees with Shakespeare about as well as Shakespeare agrees with himself". What he did not realize, however, was that authors may vary over time and between genres. (2) It is true nevertheless that the pattern for Marlowe's later plays correlates (at an astonishing r = .9998, where r = 1 is perfect correlation) more closely with Shakespeare's histories, tragedies and 'Roman' plays than Shakespeare's own comedies do with them (.9986), or than Marlowe's earlier plays do with his later ones (.9943).
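For readers who want to see what such a figure measures, here is a minimal sketch of a correlation between two word-length profiles. It assumes that the r quoted above is the ordinary Pearson coefficient, which the text does not state; the function name pearson_r and the two short profiles are invented purely for illustration and are not Mendenhall's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Invented proportions of 1- to 6-letter words for two hypothetical texts.
profile_a = [0.04, 0.17, 0.22, 0.24, 0.13, 0.08]
profile_b = [0.05, 0.16, 0.23, 0.23, 0.14, 0.08]
r = pearson_r(profile_a, profile_b)
print(f"r = {r:.4f}")
```

Because all such profiles rise and fall in broadly the same way, values of r close to 1 are to be expected; it is the differences in the third and fourth decimal places that are being compared.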

I have analysed works by thirty-six famous writers in English, covering well over five million words in all. The analysis shows not only that Mendenhall was right to predict a 'characteristic curve' for each writer, provided that like is matched with like (Appendix III shows just a few examples), but also that not one of the six hundred and sixty-six possible pairings of authors yields a correlation higher than the Marlowe/Shakespeare one given above (Appendix IV).

Function words

In Wells & Taylor's Textual Companion (3) to their Oxford Complete Works, Prof. Gary Taylor examined the frequency with which ten 'function words' (but, by, for, no, not, so, that, the, to and with) are used in each work, and compared Shakespeare's 'normal' usage with that of other writers. He asserted that by doing so he could identify authorship, and confidently stated that the Marlowe of Tamburlaine could not possibly have written Shakespeare's plays.

Quite apart from the invalidity of his statistical argument for reaching this conclusion, (4) he failed to take the probable date of composition into account - i.e. to see possible trends. When this is done, his own figures - both the word frequency and the dates - can be used to support the case for Christopher Marlowe as 'Shakespeare'. Shakespeare's relative use of some of Taylor's function words, such as for and with, tends to decrease over time, whereas with others, such as no and to, it tends to increase. All I have done is to take the totals for each of what Taylor calls Shakespeare's 'core canon' works, divide the number of these 'decreasing' words by the number of 'increasing' ones in each case, and plot them against a timescale. This is shown as Appendix V, (5) in which the trend shows up very clearly.
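The calculation for a single work can be sketched as below. This is an illustration under stated assumptions, not the working behind Appendix V: only the four words actually named above (for and with as 'decreasing', no and to as 'increasing') are used, the function name function_word_ratio is hypothetical, and the sample sentence is made up.

```python
import re
from collections import Counter

# Only these four words are named in the text; the full grouping of
# Taylor's ten function words is not given, so this list is illustrative.
DECREASING = ("for", "with")
INCREASING = ("no", "to")


def function_word_ratio(text, decreasing=DECREASING, increasing=INCREASING):
    """Total count of 'decreasing' words divided by total of 'increasing' ones."""
    # Tokenise on runs of letters; apostrophes simply split words, which
    # does not affect the four function words counted here.
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    dec_total = sum(counts[word] for word in decreasing)
    inc_total = sum(counts[word] for word in increasing)
    if inc_total == 0:
        raise ValueError("no 'increasing' words found in text")
    return dec_total / inc_total


# Made-up sentence: 'for' and 'with' occur once each, 'to' twice, 'no' once.
sample = "for you and with me, to go or no to stay"
ratio = function_word_ratio(sample)
print(f"decreasing/increasing = {ratio:.3f}")
```

One such ratio per play, set against the play's probable date, gives the points that are plotted on the timescale.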

In this, and all of the following appendices, the polynomial trendline calculated by the spreadsheet is based only on the Shakespeare items. In fact, wherever a trend can be established for Shakespeare, Marlowe is always to be found where one might have expected a young Shakespeare to be, had he written anything at all before his late twenties.
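The procedure just described (fit the trend to the Shakespeare points alone, then see where the Marlowe points fall) can be sketched as follows. This is not the spreadsheet's own routine: it is an ordinary least-squares quadratic with the coefficient of determination mentioned in note (5), and the dates and ratios in the example are invented placeholders rather than figures from the appendices.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(xs, ys):
    """Least-squares quadratic fit; returns (predict, r_squared)."""
    n = len(xs)
    if n != len(ys) or len(set(xs)) < 3:
        raise ValueError("need at least three distinct x values")
    x0 = sum(xs) / n                      # centre the dates for stability
    us = [x - x0 for x in xs]
    s = [sum(u ** k for u in us) for k in range(5)]
    t = [sum((u ** k) * y for u, y in zip(us, ys)) for k in range(3)]
    # Normal equations for y = a + b*u + c*u^2, solved by Cramer's rule.
    a_mat = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = det3(a_mat)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in a_mat]
        for row in range(3):
            m[row][col] = t[row]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs

    def predict(x):
        u = x - x0
        return a + b * u + c * u * u

    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("y values are constant; R-squared is undefined")
    ss_res = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    return predict, 1 - ss_res / ss_tot


# Invented placeholder data: the trend is fitted to 'Shakespeare' only.
shakespeare_dates = [1592, 1595, 1598, 1601, 1604, 1607, 1610]
shakespeare_ratios = [0.95, 0.85, 0.72, 0.66, 0.58, 0.55, 0.50]
trend, r_squared = fit_quadratic(shakespeare_dates, shakespeare_ratios)
print(f"R-squared = {r_squared:.4f}")
# Extend the curve back to a hypothetical Marlowe date for comparison.
print(f"trend value at 1588 = {trend(1588):.3f}")
```

The point of fitting to one group only is that the other group's position relative to the extended curve is then a genuine comparison rather than something built into the fit.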

In an attempt to get a better trend for Shakespeare, still using function words, I increased the number of words to 50, with 25 'decreasing' words divided by 25 'increasing'. Appendix VI illustrates the result, which is even more revealing. (6)

Run-on lines and feminine endings

The same applies wherever an attempt is made to discriminate between the works of Marlowe and Shakespeare. For example, we know that Shakespeare made use of far more run-on lines and feminine endings than Marlowe. Briefly, a 'run-on line' is one where any form of punctuation at the end of the line would result in nonsense. Take, for example, Prospero's famous lines in Act IV scene 1 of The Tempest:

Our revels now are ended: these our actors,
As I foretold you, were all spirits, and
Are melted into air, into thin air.

It would simply not be possible to put any sort of punctuation at the end of the second line without wrecking the sense. That's a run-on line, or 'enjambment'. And a 'feminine ending'? Well, the usual rhythm of Marlowe's and of Shakespeare's blank verse is the iambic (di-dum) pentameter (repeated 5 times). You see this illustrated in lines two and three above. Not in line one, however, which seems to have acquired an extra "di" at the end. That's a feminine ending. For each work, I counted up how many times these two techniques were used, and expressed the total as a percentage of the number of lines of verse in each case.

When these percentages are plotted against the same time-scale, rather than the overall figure being used for a straight comparison between authors, a perfectly smooth curve can be seen to pass through the two groups of plays. (Appendix VII) (7)


Similarly, it has been noticed that even the relative frequency with which individual letters are employed may be used to discriminate between authors! It may seem astonishing, but there are some letters which Shakespeare's words used less as time went by, and some that he used more. I simply divided one group by the other for each of the works of both Shakespeare and Marlowe. Once again, plotting these results against the probable date of the work in question shows a smooth transition from Marlowe to Shakespeare. (Appendix VIII) (8)
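A minimal sketch of the division involved is given below. The text does not say which letters fall into which group, so the two groups in the example are pure placeholders chosen to show the mechanics, and the function name letter_ratio is hypothetical.

```python
from collections import Counter


def letter_ratio(text, falling, rising):
    """Count of letters in the 'falling' group divided by the 'rising' group."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    falling_total = sum(counts[ch] for ch in falling)
    rising_total = sum(counts[ch] for ch in rising)
    if rising_total == 0:
        raise ValueError("no letters from the 'rising' group in text")
    return falling_total / rising_total


# Placeholder groups only: these are NOT the letters used for Appendix VIII.
example = letter_ratio("Our revels now are ended", falling="ae", rising="ou")
print(f"falling/rising = {example:.2f}")
```

As with the function words, one figure per work is obtained and then set against the work's probable date.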

It is perhaps also worth mentioning that, if the dates for Two Gentlemen of Verona and The Taming of the Shrew are shifted beyond the killing in May 1593, to when Marlowe may well have actually been in Italy, a better trendline (9) is obtained for all four graphs.


It has long been recognized that Christopher Marlowe created the dramatic form of blank verse that Shakespeare was to develop so majestically, and which we indeed now think of as 'Shakespearean'. So well did Shakespeare apparently learn from Marlowe's model that there are still plays (e.g. Edward III) where opinion is divided as to whether they are by Marlowe or by Shakespeare. Bakeless (10) went into the similarities between Marlowe and the early Shakespeare in considerable detail, and pointed out that there are countless parallelisms both within and between the works of the two authors. These show the poet, probably unconsciously, coming up with similar words or images to describe a similar event or state of mind. An example familiar to most people would be those lines which open the 'balcony scene' in Romeo and Juliet (Act II Scene 1), compared here with the words of Barabas in Marlowe's Jew of Malta (Act II Scene 1), when Abigail appears, also on a balcony.

But, soft! what light through yonder window breaks? | But stay! what star shines yonder in the east?
It is the East, and Juliet is the sun! | The lodestar of my life, if Abigail!
(Romeo and Juliet, Act II Scene 1) | (Jew of Malta, Act II Scene 1)

Here are just a few more (Marlowe first):

Then from the navel to the throat at once | Till he unseamed him from the nave to th' chops,
He ripped old Priam, | And fixed his head upon our battlements.
(Dido, Act II Scene 1) | (Macbeth, Act I Scene 2)

Nature doth strive with fortune and his stars | But thou art fair, and at thy birth, dear boy,
To make him famous in accomplished worth. | Nature and fortune joined to make thee great.
(1 Tamburlaine, Act II Scene 1) | (King John, Act III Scene 1)

Flying dragons, lightning, fearful thunderclaps, | Vaunt couriers of oak-cleaving thunderbolts,
Singe these fair plains... | Singe my white head!
(2 Tamburlaine, Act III Scene 2) | (King Lear, Act III Scene 2)

Rightly, such similarities have never been considered sufficient to determine authorship, but I thought that it might nevertheless be interesting to take a single speech by Marlowe and see the extent to which each line is reflected in the works of Shakespeare. Here is the result.

Marlowe's Edward II (V.i.51-83) | Shakespeare
Ah, Leicester, weigh how hardly I can brook | Knowing how hardly I can brook abuse (2H6)
To lose my crown and kingdom without cause, | Your crown and kingdom, indirectly held (H5)
To give ambitious Mortimer my right, | Ambitious York did level at thy crown, (3H6)
That like a mountain overwhelms my bliss; | o'erwhelm it / As fearfully as doth a galled rock (H5)
In which extreme my mind here murdered is. | O what a noble mind is here o'erthrown (Ham)
But what the heavens appoint, I must obey. | No more obey the heavens than our courtiers (Cym)
Here, take my crown, the life of Edward too: | There is a plot against my life, my crown (WT)
Two kings in England cannot reign at once. | Nor can one England brook a double reign (1H4)
But stay a while, let me be king till night, | But stay awhile - what company is this? (Shrew)
That I may gaze upon this glittering crown; | I'll make my heaven to dream upon the crown; (3H6)
So shall my eyes receive their last content, | Nor to be seen: my crown is called content. (3H6)
My head, the latest honour due to it, | This is the latest glory of thy praise, (1H6)
And jointly both yield up their wished right. | Tomorrow yield up rule, resign my life, (Titus)
Continue ever, thou celestial sun; | But now I worship a celestial sun. (2GV)
Let never silent night possess this clime: | Let never day nor night unhallowed pass, (2H6)
Stand still, you watches of the element; | A silence in the heavens, the rack stand still, (Ham)
All times and seasons, rest you at a stay, | Make glad and sorry seasons as thou fleet'st, / And do whate'er thou wilt, swift-footed Time
That Edward may be still fair England's king. | Arm, fight, and conquer for fair England's sake. (R3)
But day's bright beams doth vanish fast away, | But to the brightest beams / Distracted clouds give way (AW)
And needs I must resign my wished crown. | Nay, then I see that Edward needs must down. (3H6)
Inhuman creatures, nursed with tiger's milk, | ...more inhuman...than tigers of Hyrcania.
 | ...no more mercy in him than... milk in a male tiger
Why gape you for your sovereign's overthrow? | That plotted thus our glory's overthrow (1H6)
My diadem, I mean, and guiltless life. | And all to make away my guiltless life. (2H6)
See, monsters, see! I'll wear my crown again. | See, brother, see! Note how she quotes the leaves. (Titus)
What, fear you not the fury of your king? | Then, till the fury of his highness settle, (WT)
But, hapless Edward, thou art fondly led; | Hapless Egeon, whom the fates have marked (CE)
They pass not for thy frowns as late they did, | Fear no more the frown o'th' great, (Cym)
But seek to make a new elected king; | But seek revenge on Edward's mockery. (3H6)
Which fills my mind with strange despairing thoughts, | And manage it against despairing thoughts. (2GV)
Which thoughts are martyred with endless torments. | Immodestly lies martyred with disgrace (RoL)
And in this torment comfort find I none, | And from that torment I will free myself, (3H6)
But that I feel the crown upon my head; | And set a precious crown upon thy head (1H6)
And therefore let me wear it yet a while. | And therefore let me have him home with me. (CE)

On its own this proves nothing, of course, but given all that has gone before..?



(1) T. C. Mendenhall, 'The characteristic curves of composition' (1887) in Science, Vol.IX no.214 (supplement) pp.237-249. And 'A mechanical solution of a literary problem' (1901) in The Popular Science Monthly, Vol.LX no.7, pp.97-105. Critics please note: it was the profile and specifically not the average that he suggested using, and he never claimed to have proved anything.

(2) Which in fact means that not much can be made of his comparison of Bacon and Shakespeare. Whereas Mendenhall was restricted to using large samples of the authors' works, and had to employ teams of women to do the counting, we are much more fortunate. Machine-readable versions of the works are fairly easily obtainable and the word processor used for this paper (Microsoft Word) is quite capable of doing all of the counting required. For most of the following I have used either the text included in the CD-ROM 'Classic Library' from Andromeda, the Internet, or files provided for me by the Oxford University Computing Service. I prefer to use versions in modern spelling, as it is the choice of words rather than the way they are spelled that is being compared. I remove the apostrophes (so that 's doesn't count as a one-letter word, for example) and for plays I also take out the Dramatis Personae and the speech headings.

(3) Gary Taylor, 'The canon and chronology of Shakespeare's plays', pp.81-2, in Stanley Wells and Gary Taylor, (et al): William Shakespeare, a textual companion (1987), Clarendon Press, Oxford.

(4) Without going into the detail of this, which would take a page or two, I will simply say that it is something about which I have exchanged correspondence with Prof. Taylor, and that he seemed to accept my criticism of his method.

(5) The trendline, calculated automatically by the spreadsheet, is a polynomial (quadratic) curve based only on the Shakespeare data. As a measure of the validity of the trend, it also calculated an R² value of .3964. This (called the coefficient of determination) tells us that nearly 40% of the variation about the mean (.665) is 'explained' by the trend. Only after calculating and plotting the Shakespeare data were the Marlowe figures added, using dates suggested by C. F. Tucker Brooke in The Works of Christopher Marlowe (1910), Clarendon Press, Oxford.

(6) It worked, as I was able thereby to obtain an R² value of .68 (i.e. 68% of the 'Shakespeare' variation is now explained by the trend). Note what an excellent 'discriminator' between Marlowe and Shakespeare this would have made. If A (the 'decreasing' total) exceeds B (the 'increasing' total), it is Marlowe; but if B exceeds A, it is Shakespeare, and the only exceptions (2H6, 3H6 and Titus) are said by many to have been by Marlowe anyway.

(7) The R² value for Shakespeare has this time gone up to an impressive .8742.

(8) I am indebted to Dr. Tom Merriam for pointing this out (although he certainly does not share my interpretation of its significance). I include the poems this time to show the surprising juxtaposition of Hero & Leander and Venus & Adonis. The R² value is not quite so high this time (.22), mainly because of the effect of the poems and of the obvious 'upturn' in the later years. Again the letters chosen are those providing the best trend.

(9) i.e. a higher value for R².

(10) Bakeless, op. cit.
