Rulia (github): une vie polyamoureuse entre R et Julia
R(ulia) Drouilhet
https://cqls.dyndoc.fr/Rulia/talk
►Introduction
►Why I love R
and julia
?
R
: (single) dispatching, keyword arguments, simplicity, metaprogramming, unique package managerjulia
: almost same thing + performance (multiple dispatching, J.I.T. compilation,tree of Types)
R
language (my first love π)- (single) dispatching (S3): as an alternative (equivalent) to OOP
- keyword arguments: named parameters with default values equivalent to several functions within unique function definition
- simplicity: few data structures and concept of attributes (metadata)
- metaprogramming: expressiveness, LISP-like programming (code is data)
- package manager: unique system
julia
language: worthy successor of R
, an UNIQUE language hybrid of R
/C++
- the tryptic:
- multiple dispatching : action differing among signature (not only the first argument type)
- Just In Time (J.I.T.) compilation
- tree of Types (any variable has a type, even a type)
- performance: recovery (direct and indirect) of the for loop (
C++
-like) - expressiveness:
- metaprogramming via macro system
- keyword arguments (not considered in the signature)
- package manager: unique system (based on a package itself:
Pkg.jl
)
►Why the package Rulia
(julia
for R
)?
- an evidence (for me):
julia
an UNIQUE language combining expressiviness ofR
and performance ofC++
Rulia
(alternative ofJuliaCall
) for creation ofR
package to "wrap"julia
package without an dependence toRcpp
andRCall.jl
(unlikeJuliaCall
)
History of
Rulia
jl4R
(the previous name) started about 10 years ago (whenJuliaCall
is 7 years old), conversions restricted to communication betweenR vector
andjulia Array
- when was the start of "replacing
Rcpp
withRulia
" :-
creation of
VirtualAgeModels.jl
(julia
package) inspired fromVAM
(R
package based onRcpp
) -
an evidence:
julia
an UNIQUE language combining expressiviness ofR
and performance ofC++
-
creation of
- intensification of the development of
Rulia
:-
for creation of
R
package to "wrap"julia
package -
no dependencies on
Rcpp
andRCall.jl
(unlikeJuliaCall
) -
R
user of the "wrapper" has no need to knowjulia
(as forRcpp
)
-
for creation of
Final goal of
Rulia
: VirtualAgeModelsJL
simulationjulia> using VirtualAgeModels julia> m_sim = @vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5))); julia> params(m_sim) 3-element Vector{Float64}: 0.001 2.5 0.4 julia> df = rand(m_sim,100); julia> first(df,3) 3Γ2 DataFrame Row β Time Type β Float64 Int64 ββββββΌββββββββββββββββ 1 β 12.3536 -1 2 β 14.778 -1 3 β 30.3098 -1
R> jl_set.seed(13) R> require(VirtualAgeModelsJL) R> m_sim = vam(Time & Type ~ (ARAInf(.4) | Weibull(.001,2.5))) R> params(m_sim) [1] 0.001 2.500 0.400 R> df = rand(m_sim,100) R> head(df,3) Time Type 1 20.43478 -1 2 21.53152 -1 3 26.12670 -1
Final goal of
Rulia
: VirtualAgeModelsJL
estimation MLEjulia> using VirtualAgeModels julia> m_mle = @vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5))); julia> mle(m_mle, df); julia> params(m_mle) 3-element Vector{Float64}: 0.001 2.207031809102441 0.5353646517580017
R> require(VirtualAgeModelsJL) R> m_mle = vam(Time & Type ~ (ARAInf(.4) | Weibull(.001,2.5))) R> mle(m_mle, data = df) [1] 0.0010000 2.5754666 0.3955855 R> params(m_mle) [1] 0.0010000 2.5754666 0.3955855
►Rulia
(julia
for R
)
►Use of Rulia
- Workflow
julia
function(s) creationjulia
function(s) call(s) withinR
- 2 conversions required:
R
tojulia
to fulfill argumentsjulia
toR
to use the results withinR
Use of
Rulia
- Worflow step 1:
julia
function(s) creation:-
julia
package creation (then inR
, _`jlusing(_package_)`_) -
julia
file creation (then inR
, _`jlinclude(_file.jl_)`_) -
"inline"
julia
function creation (likeRcpp
inline mode) directly withinR
(then inR
, _`jl(_func_)`_)
-
- Worflow step 2:
julia
function(s) call(s) withinR
-
conversion
R
tojulia
via _`jl(_R_object_)`_ (in fact _`jlvalue(_R_object_)`_ -
julia
function call withinR
(via _`jl(_func_)(...)`_) -
conversion
julia
toR
(via _`R(_jlvalue_object_)`_)
-
conversion
3 user modes for
Rulia
jl
mode ("safe" and user-friendly) :-
only based on an unique
jl()
function -
goals:
-
coding within
R
as close as possible tojulia
code -
use of backtick ("`") to enter
julia
content -
if possible, only one
jl()
function to express ajulia
call
-
coding within
-
only based on an unique
jleval
mode ("safe") :-
"safe" by using a system of exception (
try/catch
) -
main functions:
jleval()
etjlcall()
-
"safe" by using a system of exception (
jlvalue
mode ("unsafe"):-
conversions with
jlvalue()
and _`jlvalue_...()`_ function call: -
not "safe" for function call since use of
API C
ofR
andjulia
- developer use only (you have to know that the action is correct)
-
conversions with
1. Code
julia
(mode jl
)julia> [1,3,4] 3-element Vector{Int64}: 1 3 4 julia> VERSION v"1.9.3"
R> jl(`[1,3,4]`) 3-element Vector{Int64}: 1 3 4 R> jl(VERSION) # ou jl(`VERSION`) v"1.9.3"
1. Code
julia
(mode jleval
)julia> [1,3,4] 3-element Vector{Int64}: 1 3 4 julia> VERSION v"1.9.3"
R> jleval("[1,3,4]") 3-element Vector{Int64}: 1 3 4 R> jleval("VERSION") v"1.9.3"
1. Code
julia
(mode jlvalue
)julia> [1,3,4] 3-element Vector{Int64}: 1 3 4 julia> VERSION v"1.9.3"
R> jlvalue_eval("[1,3,4]") 3-element Vector{Int64}: 1 3 4 R> jlvalue_eval("VERSION") v"1.9.3"
2. Code
julia
multilines (mode jl
)julia> f(x,y) = x + y f (generic function with 1 method) julia> (f(2,3), f(1.0,3)) (5, 4.0)
R> jl(` + f(x,y) = x + y + (f(2,3), f(1.0,3)) + `) (5, 4.0)
2. Code
julia
multilines (mode jleval
)julia> f(x,y) = x + y f (generic function with 1 method) julia> (f(2,3), f(1.0,3)) (5, 4.0)
R> jleval(" + f(x,y) = x + y + (f(2,3), f(1.0,3)) + ") (5, 4.0)
2. Code
julia
multilines (mode jlvalue
)julia> f(x,y) = x + y f (generic function with 1 method) julia> (f(2,3), f(1.0,3)) (5, 4.0)
R> jlvalue_eval(" + f(x,y) = x + y + (f(2,3), f(1.0,3)) + ") (5, 4.0)
3. Conversion (basic)
R
to julia
(mode jl
)julia> t = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof.(t) (Bool, Int64, Float64, String)
R> jl(TRUE) true R> jl(1L) 1 R> jl(1) 1.0 R> jl("1.0") "1.0"
3. Conversion (basic)
R
to julia
(mode jlvalue
)julia> t = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof.(t) (Bool, Int64, Float64, String)
R> jlvalue(TRUE) true R> jlvalue(1L) 1 R> jlvalue(1) 1.0 R> jlvalue("1.0") "1.0"
4. Conversion (vector)
R
to julia
(mode jl
)julia> a = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(a) Tuple{Bool, Int64, Float64, String}
R> jl(c(TRUE, 1L, 1, "1.0")) 4-element Vector{String}: "TRUE" "1" "1" "1.0" R> jl(list(TRUE, 1L, 1, "1.0")) (true, 1, 1.0, "1.0")
4. Conversion (vector)
R
to julia
(mode jlvalue
)julia> a = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(a) Tuple{Bool, Int64, Float64, String}
R> jlvalue(c(TRUE, 1L, 1, "1.0")) 4-element Vector{String}: "TRUE" "1" "1" "1.0" R> jlvalue(list(TRUE, 1L, 1, "1.0")) (true, 1, 1.0, "1.0")
5.
julia
variables within R
(mode jl
)julia> a = [true, 1, 1.0, "1.0"] 4-element Vector{Any}: true 1 1.0 "1.0" julia> typeof(a) Vector{Any} (alias for Array{Any, 1}) julia> b = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(b) Tuple{Bool, Int64, Float64, String}
R> ## Native julia objects R> jl(a = `[true, 1, 1.0, "1.0"]`) R> jl(a) 4-element Vector{Any}: true 1 1.0 "1.0" R> jl()$a 4-element Vector{Any}: true 1 1.0 "1.0" R> jl(b = `(true, 1, 1.0, "1.0")`) R> jl(b) (true, 1, 1.0, "1.0") R> jl()$b (true, 1, 1.0, "1.0") R> ## with conversion R -> julia R> jl(b2 = list(TRUE, 1L, 1, "1.0")) R> jl(b2) (true, 1, 1.0, "1.0")
5.
julia
variables within R
(mode jleval
)julia> a = [true, 1, 1.0, "1.0"] 4-element Vector{Any}: true 1 1.0 "1.0" julia> typeof(a) Vector{Any} (alias for Array{Any, 1}) julia> b = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(b) Tuple{Bool, Int64, Float64, String}
R> jleval('a =[true, 1, 1.0, "1.0"]') 4-element Vector{Any}: true 1 1.0 "1.0" R> jleval('a') 4-element Vector{Any}: true 1 1.0 "1.0" R> jleval('b = (true, 1, 1.0, "1.0")') (true, 1, 1.0, "1.0") R> jleval('b') (true, 1, 1.0, "1.0") R> ## error below don't crash R> jleval('b = (true, 1, 1.0, "1.0"') Julia Exception: ErrorException
5.
julia
variables within R
(mode jlvalue
)julia> a = [true, 1, 1.0, "1.0"] 4-element Vector{Any}: true 1 1.0 "1.0" julia> typeof(a) Vector{Any} (alias for Array{Any, 1}) julia> b = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(b) Tuple{Bool, Int64, Float64, String}
R> jlvalue_eval('a =[true, 1, 1.0, "1.0"]') 4-element Vector{Any}: true 1 1.0 "1.0" R> jlvalue_eval('a') 4-element Vector{Any}: true 1 1.0 "1.0" R> jlvalue_eval('b = (true, 1, 1.0, "1.0")') (true, 1, 1.0, "1.0") R> jlvalue_eval('b') (true, 1, 1.0, "1.0") R> ## error below will crash badly R> # jlvalue_eval('b = (true, 1, 1.0, "1.0"')
6.
julia
function call within R
(mode jl
)julia> sum sum (generic function with 10 methods) julia> typeof(sum) typeof(sum) (singleton type of function sum, subtype of Function) julia> sum([1,3,2]) 6 julia> sum([1,3,2], init=4) 10 julia> isa(sum,Function) true julia> sum isa Function true
R> jl(sum) sum (generic function with 10 methods) R> class(jl(sum)) [1] "typeof(sum)" "jlfunction" R> jl(typeof)(sum) typeof(sum) (singleton type of function sum, subtype of Function) R> jl(sum)(`[1,3,2]`) 6 R> jl(sum)(c(1,3,2)) 6.0 R> jl(sum)(c(1,3,2), init=4) 10.0 R> jl(isa)( `sum`,`Function`) true R> jl(isa)(sum,Function) true
6.
julia
function call within R
(mode jleval
)julia> sum sum (generic function with 10 methods) julia> typeof(sum) typeof(sum) (singleton type of function sum, subtype of Function) julia> sum([1,3,2]) 6 julia> sum([1,3,2], init=4) 10 julia> isa(sum,Function) true julia> sum isa Function true
R> jleval("sum") sum (generic function with 10 methods) R> jleval("typeof(sum)") typeof(sum) (singleton type of function sum, subtype of Function) R> jlcall("sum", jleval("[1,3,2]")) 6 R> jlcall("sum", c(1,3,2), init=4) 10.0 R> jlcall("isa", jleval("sum"),jleval("Function")) true R> jleval("sum isa Function") true
6.
julia
function call within R
(mode jlvalue
)julia> sum sum (generic function with 10 methods) julia> typeof(sum) typeof(sum) (singleton type of function sum, subtype of Function) julia> sum([1,3,2]) 6 julia> sum([1,3,2], init=4) 10 julia> isa(sum,Function) true julia> sum isa Function true
R> jlvalue_eval("sum") sum (generic function with 10 methods) R> jlvalue_eval("typeof(sum)") typeof(sum) (singleton type of function sum, subtype of Function) R> jlvalue_call("sum",jlvalue_eval("[1,3,2]")) 6 R> ## Not possible: jlvalue_call("sum", jlvalue([1,3,2]), init=4)") R> jlvalue_eval("isa(sum,Function)") true R> jlvalue_eval("sum isa Function") true
7. Conversion
julia
to R
R> jl(a = `[1,3,2]`) R> R(jl(a)) [1] 1 3 2 R> jl(sum)(a) 6 R> jl(sum)(a) 6 R> R(jl(sum)(a)) [1] 6 R> R(jl(sum)(c(1,3,2), init=4)) [1] 10 R> R(jlcall("sum", c(1,3,2), init=4)) [1] 10
►Rulia
: jl
mode more in details
R
metaprogramming allows us to offer a format very similar to originaljulia
code- no string to call
julia
function
1. Mode
Règle 1: jl+
: fonctionnement de jl()
jl(`...`)
: if argument of jl()
is a name
(between backticks) then julia
code execution and the returned result if of R
class jlvalue
R> jl(`10`) 10 R> jl(`'1'`) '1': ASCII/Unicode U+0031 (category Nd: Number, decimal digit) R> jl(`"1"`) "1" R> jl(`(a=[1,"toto", '1'], b=12//13)`) (a = Any[1, "toto", '1'], b = 12//13) R> class(jl(`'1'`)) [1] "Char" "jlvalue" R> class(jl(`"1"`)) [1] "String" "Struct" "jlvalue"
1. Mode
Règle 2: if argument of jl+
: fonctionnement de jl()
jl()
is already a jlvalue
, the returned resukt is itself
R> jl(`10`) 10 R> jl(jl(`10`)) 10 R> jlval10 <- jl(`10`) R> class(jlval10) [1] "Int64" "jlvalue" R> jl(jlval10) 10 R> R(jlval10) [1] 10
1. Mode
Règle 3: if argument of jl+
: fonctionnement de jl()
jl()
is an R
object, it is converted within julia
and an jlvalue
external pointer is returned (thanks to a redirection to jlvalue()
)
R> c(jl(10), jl(10L), jl("10"), jl(TRUE)) ## ou c(jlvalue(10), jlvalue(10L), jlvalue("10"), jlvalue(TRUE)) [[1]] 10.0 [[2]] 10 [[3]] "10" [[4]] true
1. Mode
Règle 4: jl+
: fonctionnement de jl()
R
variable comme argument de jl()
higher priority than a julia
variable with the same name
R> jl(a = 10) R> a <- 12 R> jl(a) 12.0 R> jl()$a 10.0 R> jl(b = 11) R> b Erreur dans evalq({ : objet 'b' introuvable R> jl(b) 11.0 R> jl()$b 11.0
1. Mode
Règle 5: if argument of jl+
: fonctionnement de jl()
jl()
is a julia
expression (content without backticks), julia
object is returned as a jlvalue
(in fact in R
, `abc`
and abc
is the same, so jl(sin)
is the same than jl(`sin`)
) )
R> jl(sin) sin (generic function with 14 methods) R> jl(Array) Array R> jl(Vector) Vector (alias for Array{T, 1} where T) R> jl(jl(Vector)) Vector (alias for Array{T, 1} where T) R> class(jl(Vector)) [1] "UnionAll" "jlfunction" R> ## jl(Vector{Float64}) Not possible since Vector{Float64} is not an R expression R> jl(`Vector{Float64}`) Vector{Float64} (alias for Array{Float64, 1})
1. Mode
Règle 6:
jl+
: fonctionnement de jl()
-
if
jl()
returns ajulia
function, it is then converted to ajlfunction
that can be executed withinR
-
when calling a
jlfunction
, in afriendly manner,jl()
is applied implicitly to arguments.
R> jl(sin) sin (generic function with 14 methods) R> class(jl(sin)) [1] "typeof(sin)" "jlfunction" R> jl(sin)(13) # conversion R -> julia 0.4201670368266409 R> jl(sin)(`13`) # no conversion 0.4201670368266409 R> jl(`sqrt β +`) sqrt β (+) R> class(jl(`sqrt β +`)) [1] "ComposedFunction{typeof(sqrt), typeof(+)}" "jlfunction" R> jl(`sqrt β +`)(3, 6) 3.0
1. Mode
Règle 7: if argument of jl+
: fonctionnement de jl()
jl()
is a name
with bad julia
expression, a jlexception
is returned
R> jl(sin)(`[1,2]`) Julia Exception: MethodError R> class(jl(sin)(`[1,2]`)) [1] "MethodError" "jlexception" "jlvalue"
2. Broadcasting with
Rulia
julia> sin.([1,4,2]) 3-element Vector{Float64}: 0.8414709848078965 -0.7568024953079282 0.9092974268256817 julia> broadcast(sin, [1,4,2]) 3-element Vector{Float64}: 0.8414709848078965 -0.7568024953079282 0.9092974268256817
R> jl(broadcast)(sin, c(1,4,2)) 3-element Vector{Float64}: 0.8414709848078965 -0.7568024953079282 0.9092974268256817
3. Splat with
Rulia
julia> Ξ£(x...) = sum(x) Ξ£ (generic function with 1 method) julia> Ξ£(1,3,2,4.0), Ξ£(1,3,2) (10.0, 6) julia> julia> somme = splat(+) splat(+) julia> somme([1,4,2.0]) 7.0
R> ## DΓ©finition inside julia R> jl(`Ξ£(x...) = sum(x)`) Ξ£ (generic function with 1 method) R> jl(Ξ£)(1,3,2,4.0) 10.0 R> jl(Ξ£)(1,3,2) 6.0 R> ## Directly in R R> Ξ£2 <- jl(splat)(`+`) R> Ξ£2(c(1,4,2)) 7.0 R> jl(somme = jl(splat)(`+`) ) R> jl(somme) # saved inside R splat(+) R> jl(somme)(c(1,4,2)) 7.0
4. Operators
isa
and <:
with Rulia
julia> isa(1, Number) true julia> isa(1.0, Real) true julia> 1.0 isa Real true julia> typeof(1.0) Float64 julia> <:(typeof(1.0),Real) true julia> typeof(1.0) <: Real true
R> jl(isa)(1, Number) # 1 is converted to 1.0 true R> jl(`<:`)(jl(typeof)(1.0), Real) true R> "%isa%" <- function(a, b) jl(isa)(substitute(a),substitute(b)) R> "%<:%" <- function(a, b) jl(`<:`)(substitute(a),substitute(b)) R> 1 %isa% Number true R> jl(typeof)(1.0) Float64 R> Float64 %<:% Real true
►Rulia
advanced use in Statistics
- conversion
data.frame
(R
) andDataFrame
(julia
) - conversion
factor
(R
) andCategoricalArrays
(julia
)
1. Data Frame with
Rulia
julia> using DataFrames julia> df = DataFrame(a=1:4, b=repeat([true,false],2)) 4Γ2 DataFrame Row β a b β Int64 Bool ββββββΌββββββββββββββ 1 β 1 true 2 β 2 false 3 β 3 true 4 β 4 false julia> nrow(df), ncol(df) (4, 2) julia> DataFrame <: AbstractArray false
R> ## Done for you: jlusing(DataFrames) R> jl(df = data.frame(a=1:4, b=rep(c(TRUE,FALSE), 2))) R> R(jl(df)) a b 1 1 TRUE 2 2 FALSE 3 3 TRUE 4 4 FALSE R> jl(nrow)(df) 4 R> jl(`<:`)(DataFrame, AbstractArray) false R> DataFrame %<:% AbstractArray false R> df %isa% DataFrame true
2. Facteur with
Rulia
julia> using CategoricalArrays julia> fa = categorical(["titi","toto","titi"]) 3-element CategoricalArray{String,1,UInt32}: "titi" "toto" "titi" julia> levels(fa) 2-element Vector{String}: "titi" "toto" julia> CategoricalArray <: AbstractArray true
R> ## Done for you: jlusing(CategoricalArrays) R> jl(fa = factor(c("titi","toto","titi"))) R> jl(levels)(fa) 2-element Vector{String}: "titi" "toto" R> jl(`<:`)(CategoricalArray, AbstractArray) true R> CategoricalArray %<:% AbstractArray true R> fa %isa% CategoricalArray true
3. Distributions with
Rulia
julia> using Distributions julia> X = Normal() Distributions.Normal{Float64}(ΞΌ=0.0, Ο=1.0) julia> v = rand(X,100); julia> first(v,3) 3-element Vector{Float64}: 0.013900421708920682 -0.733892036458619 -1.083481478375481 julia> fit(Normal,v) Distributions.Normal{Float64}(ΞΌ=0.026161937718674335, Ο=1.0134677955904627)
R> jlusing(Distributions) R> jl(X = `Normal()`) R> v = jl(rand)(X,100L) R> jl(first)(v,3L) 3-element Vector{Float64}: 0.9372061292245324 -0.546258195514108 2.2278904198461857 R> jl(fit)(Normal,v) Normal{Float64}(ΞΌ=0.1076152320761651, Ο=1.0724267852606377)
4. Some ideas about development of
VirtualAgeModelsJL
via Rulia
julia> using Random julia> Random.seed!(13) TaskLocalRNG() julia> using VirtualAgeModels julia> m_sim = @vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5))); julia> df = rand(m_sim,100); julia> first(df,3) 3Γ2 DataFrame Row β Time Type β Float64 Int64 ββββββΌβββββββββββββββββ 1 β 9.56823 -1 2 β 20.1771 -1 3 β 33.4687 -1 julia> m_mle = @vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5))); julia> mle(m_mle, df); julia> params(m_mle) 3-element Vector{Float64}: 0.001 2.216030223105565 0.5140226562432094
R> jl_set.seed(13) R> jlusing(VirtualAgeModels) R> jl(m_sim = `@vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5)))`); R> jl(df = jl(rand)(m_sim,100L)); R> jl(first)(df,3L) 3Γ2 DataFrame Row β Time Type β Float64 Int64 ββββββΌββββββββββββββββ 1 β 20.4348 -1 2 β 21.5315 -1 3 β 26.1267 -1 R> jl(m_mle = `@vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5)))`); R> invisible(jl(mle)(m_mle, df)) R> jl(params)(m_mle) 3-element Vector{Float64}: 0.001 2.575466578912153 0.3955855367296208
►Comments
A few words on the dΓ©veloppement of Rulia
-
class
jlvalue
as anexternal pointer
-
garbage collector of
julia
called whenever the garbage collector ofR
is cleaning anR
object of classjlvalue
-
function of class
jlfunction
-
errors managed by
R
object of classjlexception
Rulia
limitations
-
Rulia
is a package still experimental and then not considered as stable -
Binary installation only provided for
Windows
. For other OS, installation needs (at least now) aC
compiler (see https://github.com/rcqls/Rulia )