I am using the `randomForest` library in R via `RPy2`. I would like to pass back the values calculated using the `caret` `predict` method and join them to the original `pandas` dataframe. See example below.

```
import pandas as pd
import numpy as np
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
pandas2ri.activate()
r = robjects.r
r.library("randomForest")
r.library("caret")
df = pd.DataFrame(data=np.random.rand(100, 10), columns=["a{}".format(i) for i in range(10)])
df["b"] = ['a' if x < 0.5 else 'b' for x in np.random.sample(size=100)]
train = df.loc[df.a0 < .75]
withheld = df.loc[df.a0 >= .75]
rf = r.randomForest(robjects.Formula('b ~ .'), data=train)
pr = r.predict(rf, withheld)
print(pr.rx())
```

Which returns

```
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
a a b b b a a a a b a a a a a b a a a a
Levels: a b
```

But how can I `join` this to the `withheld` dataframe or compare it to the original values?

I have tried this:

```
import pandas.rpy.common as com
com.convert_robj(pr)
```

But this returns a dictionary where the keys are strings. I guess there is a workaround of calling `withheld.reset_index()`, converting the dict keys to integers, and then joining the two, but there must be a simpler way!

# Best How To :

There is a pull request that adds conversion of R factors to Pandas Categoricals. It has not yet been merged into the Pandas master branch. When it is,

```
import pandas.rpy.common as rcom
rcom.convert_robj(pr)
```

will convert `pr` to a Pandas Categorical. Until then, you can use the following workaround:

```
def convert_factor(obj):
    """
    Taken from jseabold's PR: https://github.com/pydata/pandas/pull/9187
    """
    ordered = r["is.ordered"](obj)[0]
    categories = list(obj.levels)
    codes = np.asarray(obj) - 1  # zero-based indexing
    values = pd.Categorical.from_codes(codes, categories=categories,
                                       ordered=ordered)
    return values
```
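To see what the conversion does without needing R, here is a minimal sketch in which hand-made arrays stand in for an R factor. The names `r_codes` and `r_levels` are hypothetical and only mimic the 1-based integer codes and the levels that a factor carries.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for an R factor: 1-based codes plus its levels
r_codes = np.array([1, 1, 2, 2, 1])
r_levels = ["a", "b"]

# R codes start at 1, pandas Categorical codes start at 0
codes = r_codes - 1
values = pd.Categorical.from_codes(codes, categories=r_levels, ordered=False)

print(list(values))
```

The subtraction is the only real work here: each code becomes a zero-based position in the list of categories.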

For example,

```
import pandas as pd
import numpy as np
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
pandas2ri.activate()
r = robjects.r
r.library("randomForest")
r.library("caret")

def convert_factor(obj):
    """
    Taken from jseabold's PR: https://github.com/pydata/pandas/pull/9187
    """
    ordered = r["is.ordered"](obj)[0]
    categories = list(obj.levels)
    codes = np.asarray(obj) - 1  # zero-based indexing
    values = pd.Categorical.from_codes(codes, categories=categories,
                                       ordered=ordered)
    return values

df = pd.DataFrame(data=np.random.rand(100, 10),
                  columns=["a{}".format(i) for i in range(10)])
df["b"] = ['a' if x < 0.5 else 'b' for x in np.random.sample(size=100)]
train = df.loc[df.a0 < .75]
# Copy so that adding a column below does not write into a slice of df
withheld = df.loc[df.a0 >= .75].copy()
rf = r.randomForest(robjects.Formula('b ~ .'), data=train)
pr = convert_factor(r.predict(rf, withheld))
withheld['pr'] = pr
print(withheld)
```
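Once the predictions sit in a `pr` column, comparing them with the original labels is ordinary pandas. The following is a minimal sketch that runs without R: a small hand-made frame stands in for `withheld`, and a hand-made Categorical stands in for the converted predictions, so all values are made up for illustration.

```python
import pandas as pd

# Stand-in for `withheld`; the index is non-contiguous, as after boolean filtering
withheld = pd.DataFrame({"b": ["a", "b", "b", "a"]}, index=[3, 7, 8, 12])

# Stand-in for convert_factor(r.predict(rf, withheld))
pr = pd.Categorical.from_codes([0, 1, 0, 0], categories=["a", "b"])

# A Categorical (unlike a Series) is assigned by position,
# so the index of `withheld` does not need to be reset first
withheld["pr"] = pr

# Compare predictions with the original labels
pred = withheld["pr"].astype(str)
accuracy = (withheld["b"] == pred).mean()
confusion = pd.crosstab(withheld["b"], pred)

print(accuracy)
print(confusion)
```

Because the assignment is positional, this avoids the `reset_index()` and key-conversion detour mentioned in the question, provided the predictions come back in the same row order as `withheld`.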