This is an interesting problem, but your example has too much else going on. I prefer to solve it in isolation, using my own example. Consider the following input: XML <book> <chapter id="A"> <para> <sentence id="1" length="23">Mary had a little lamb,</sentence> <sentence id="2" length="29">His fleece was white as...
The approach is to grab the data for each chapter one at a time, then combine the words of one chapter with the chapter number into a data frame. R will repeat the single value for the chapter number as often as needed: words <- letters[1:3] n <- 1...
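To make the recycling behaviour concrete, here is a minimal sketch in base R. The chapter ids and word vectors are made-up stand-ins rather than data parsed from the XML, and stacking the per-chapter pieces with `do.call(rbind, ...)` is an assumption about how the results would be combined.

```r
# Hypothetical stand-in data: the words of one chapter and its id.
words <- c("Mary", "had", "a", "little", "lamb")
chapter <- "A"

# data.frame() recycles the length-one chapter value to match words.
one_chapter <- data.frame(chapter = chapter, word = words)

# Repeat this per chapter, then stack the pieces row-wise.
chapters <- list(A = c("Mary", "had"), B = c("a", "lamb"))
pieces <- lapply(names(chapters), function(id) {
  data.frame(chapter = id, word = chapters[[id]])
})
all_words <- do.call(rbind, pieces)
```

Each call to `data.frame()` only needs the single chapter value once, so no explicit `rep()` is required.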
I added an additional xpathApply call for the sp elements: bodyToDF <- function(x){ scenenum <- xmlGetAttr(x, "n") scenetype <- xmlGetAttr(x, "type") sp <- xpathApply(x, 'sp', function(sp) { who <- xmlGetAttr(sp, "who") if(is.null(who)) who <- NA line_num <- xpathSApply(sp, 'l', function(l) { xmlGetAttr(l,"n")}) linecontent <- xpathSApply(sp, 'l', function(l) { xmlValue(l)}) data.frame( scenenum, scenetype, who, line_num,...
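The pattern in the function above (fall back to NA when an attribute is missing, then let `data.frame` recycle the single values against the per-line vectors) can be sketched without the XML package. The plain lists below are hypothetical stand-ins for parsed sp nodes, the helper name `speech_to_df` is invented for illustration, and combining the pieces with `do.call(rbind, ...)` is an assumption about how the per-speech results might be stacked.

```r
# Hypothetical stand-ins for parsed sp nodes; the second has no "who".
speeches <- list(
  list(who = "hamlet", n = c("1", "2"), text = c("To be,", "or not to be")),
  list(who = NULL, n = "3", text = "Enter Ghost")
)

# Illustrative helper: one data frame per speech.
speech_to_df <- function(sp, scenenum) {
  who <- sp$who
  if (is.null(who)) who <- NA_character_  # keep the column when missing
  data.frame(scenenum, who, line_num = sp$n, linecontent = sp$text,
             stringsAsFactors = FALSE)
}

pieces <- lapply(speeches, speech_to_df, scenenum = "1")
scene <- do.call(rbind, pieces)
```

Replacing NULL with NA matters because every piece then has the same columns, which `rbind` requires.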
The following transform will give the desired output. Note that it makes a few assumptions about how the content is structured. In particular, how do you know when a p is a footnote? It is structurally the same as other paragraphs. The code below uses the identifier naming scheme, which...
The teiHeader does not appear because of this template: <xsl:template match="teiHeader"> <xsl:apply-templates/> </xsl:template> You are matching teiHeader, but instead of copying it you pass control on to its child nodes, so no teiHeader element is written to the output. Now, you could simply remove this template,...
I think all you need is some code as follows (you will need to adapt the templates to match the elements in your namespace but as your sample does not show a namespace I have worked without one): <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="html" indent="yes"/> <xsl:template match="surface"> <xsl:apply-templates select=".//orig[note]" mode="list"/> </xsl:template>...