Rulia (github): une vie polyamoureuse entre R et Julia
R(ulia) Drouilhet
https://cqls.dyndoc.fr/Rulia/talk
►Introduction
►Why I love R and julia?
R: (single) dispatching, keyword arguments, simplicity, metaprogramming, unique package managerjulia: almost same thing + performance (multiple dispatching, J.I.T. compilation,tree of Types)
R language (my first love π)- (single) dispatching (S3): as an alternative (equivalent) to OOP
- keyword arguments: named parameters with default values equivalent to several functions within unique function definition
- simplicity: few data structures and concept of attributes (metadata)
- metaprogramming: expressiveness, LISP-like programming (code is data)
- package manager: unique system
julia language: worthy successor of R, an UNIQUE language hybrid of R/C++- the tryptic:
- multiple dispatching : action differing among signature (not only the first argument type)
- Just In Time (J.I.T.) compilation
- tree of Types (any variable has a type, even a type)
- performance: recovery (direct and indirect) of the for loop (
C++-like) - expressiveness:
- metaprogramming via macro system
- keyword arguments (not considered in the signature)
- package manager: unique system (based on a package itself:
Pkg.jl)
►Why the package Rulia (julia for R)?
- an evidence (for me):
juliaan UNIQUE language combining expressiviness ofRand performance ofC++ Rulia(alternative ofJuliaCall) for creation ofRpackage to "wrap"juliapackage without an dependence toRcppandRCall.jl(unlikeJuliaCall)
History of
Ruliajl4R(the previous name) started about 10 years ago (whenJuliaCallis 7 years old), conversions restricted to communication betweenR vectorandjulia Array- when was the start of "replacing
RcppwithRulia" :-
creation of
VirtualAgeModels.jl(juliapackage) inspired fromVAM(Rpackage based onRcpp) -
an evidence:
juliaan UNIQUE language combining expressiviness ofRand performance ofC++
-
creation of
- intensification of the development of
Rulia:-
for creation of
Rpackage to "wrap"juliapackage -
no dependencies on
RcppandRCall.jl(unlikeJuliaCall) -
Ruser of the "wrapper" has no need to knowjulia(as forRcpp)
-
for creation of
Final goal of
Rulia: VirtualAgeModelsJL simulationjulia> using VirtualAgeModels julia> m_sim = @vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5))); julia> params(m_sim) 3-element Vector{Float64}: 0.001 2.5 0.4 julia> df = rand(m_sim,100); julia> first(df,3) 3Γ2 DataFrame Row β Time Type β Float64 Int64 ββββββΌββββββββββββββββ 1 β 12.3536 -1 2 β 14.778 -1 3 β 30.3098 -1
R> jl_set.seed(13) R> require(VirtualAgeModelsJL) R> m_sim = vam(Time & Type ~ (ARAInf(.4) | Weibull(.001,2.5))) R> params(m_sim) [1] 0.001 2.500 0.400 R> df = rand(m_sim,100) R> head(df,3) Time Type 1 20.43478 -1 2 21.53152 -1 3 26.12670 -1
Final goal of
Rulia: VirtualAgeModelsJL estimation MLEjulia> using VirtualAgeModels julia> m_mle = @vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5))); julia> mle(m_mle, df); julia> params(m_mle) 3-element Vector{Float64}: 0.001 2.207031809102441 0.5353646517580017
R> require(VirtualAgeModelsJL) R> m_mle = vam(Time & Type ~ (ARAInf(.4) | Weibull(.001,2.5))) R> mle(m_mle, data = df) [1] 0.0010000 2.5754666 0.3955855 R> params(m_mle) [1] 0.0010000 2.5754666 0.3955855
►Rulia (julia for R)
►Use of Rulia
- Workflow
juliafunction(s) creationjuliafunction(s) call(s) withinR- 2 conversions required:
Rtojuliato fulfill argumentsjuliatoRto use the results withinR
Use of
Rulia- Worflow step 1:
juliafunction(s) creation:-
juliapackage creation (then inR, _`jlusing(_package_)`_) -
juliafile creation (then inR, _`jlinclude(_file.jl_)`_) -
"inline"
juliafunction creation (likeRcppinline mode) directly withinR(then inR, _`jl(_func_)`_)
-
- Worflow step 2:
juliafunction(s) call(s) withinR-
conversion
Rtojuliavia _`jl(_R_object_)`_ (in fact _`jlvalue(_R_object_)`_ -
juliafunction call withinR(via _`jl(_func_)(...)`_) -
conversion
juliatoR(via _`R(_jlvalue_object_)`_)
-
conversion
3 user modes for
Ruliajlmode ("safe" and user-friendly) :-
only based on an unique
jl()function -
goals:
-
coding within
Ras close as possible tojuliacode -
use of backtick ("`") to enter
juliacontent -
if possible, only one
jl()function to express ajuliacall
-
coding within
-
only based on an unique
jlevalmode ("safe") :-
"safe" by using a system of exception (
try/catch) -
main functions:
jleval()etjlcall()
-
"safe" by using a system of exception (
jlvaluemode ("unsafe"):-
conversions with
jlvalue()and _`jlvalue_...()`_ function call: -
not "safe" for function call since use of
API CofRandjulia - developer use only (you have to know that the action is correct)
-
conversions with
1. Code
julia (mode jl)julia> [1,3,4] 3-element Vector{Int64}: 1 3 4 julia> VERSION v"1.9.3"
R> jl(`[1,3,4]`) 3-element Vector{Int64}: 1 3 4 R> jl(VERSION) # ou jl(`VERSION`) v"1.9.3"
1. Code
julia (mode jleval)julia> [1,3,4] 3-element Vector{Int64}: 1 3 4 julia> VERSION v"1.9.3"
R> jleval("[1,3,4]") 3-element Vector{Int64}: 1 3 4 R> jleval("VERSION") v"1.9.3"
1. Code
julia (mode jlvalue)julia> [1,3,4] 3-element Vector{Int64}: 1 3 4 julia> VERSION v"1.9.3"
R> jlvalue_eval("[1,3,4]") 3-element Vector{Int64}: 1 3 4 R> jlvalue_eval("VERSION") v"1.9.3"
2. Code
julia multilines (mode jl)julia> f(x,y) = x + y f (generic function with 1 method) julia> (f(2,3), f(1.0,3)) (5, 4.0)
R> jl(` + f(x,y) = x + y + (f(2,3), f(1.0,3)) + `) (5, 4.0)
2. Code
julia multilines (mode jleval)julia> f(x,y) = x + y f (generic function with 1 method) julia> (f(2,3), f(1.0,3)) (5, 4.0)
R> jleval(" + f(x,y) = x + y + (f(2,3), f(1.0,3)) + ") (5, 4.0)
2. Code
julia multilines (mode jlvalue)julia> f(x,y) = x + y f (generic function with 1 method) julia> (f(2,3), f(1.0,3)) (5, 4.0)
R> jlvalue_eval(" + f(x,y) = x + y + (f(2,3), f(1.0,3)) + ") (5, 4.0)
3. Conversion (basic)
R to julia (mode jl)julia> t = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof.(t) (Bool, Int64, Float64, String)
R> jl(TRUE) true R> jl(1L) 1 R> jl(1) 1.0 R> jl("1.0") "1.0"
3. Conversion (basic)
R to julia (mode jlvalue)julia> t = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof.(t) (Bool, Int64, Float64, String)
R> jlvalue(TRUE) true R> jlvalue(1L) 1 R> jlvalue(1) 1.0 R> jlvalue("1.0") "1.0"
4. Conversion (vector)
R to julia (mode jl)julia> a = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(a) Tuple{Bool, Int64, Float64, String}
R> jl(c(TRUE, 1L, 1, "1.0")) 4-element Vector{String}: "TRUE" "1" "1" "1.0" R> jl(list(TRUE, 1L, 1, "1.0")) (true, 1, 1.0, "1.0")
4. Conversion (vector)
R to julia (mode jlvalue)julia> a = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(a) Tuple{Bool, Int64, Float64, String}
R> jlvalue(c(TRUE, 1L, 1, "1.0")) 4-element Vector{String}: "TRUE" "1" "1" "1.0" R> jlvalue(list(TRUE, 1L, 1, "1.0")) (true, 1, 1.0, "1.0")
5.
julia variables within R (mode jl)julia> a = [true, 1, 1.0, "1.0"] 4-element Vector{Any}: true 1 1.0 "1.0" julia> typeof(a) Vector{Any} (alias for Array{Any, 1}) julia> b = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(b) Tuple{Bool, Int64, Float64, String}
R> ## Native julia objects R> jl(a = `[true, 1, 1.0, "1.0"]`) R> jl(a) 4-element Vector{Any}: true 1 1.0 "1.0" R> jl()$a 4-element Vector{Any}: true 1 1.0 "1.0" R> jl(b = `(true, 1, 1.0, "1.0")`) R> jl(b) (true, 1, 1.0, "1.0") R> jl()$b (true, 1, 1.0, "1.0") R> ## with conversion R -> julia R> jl(b2 = list(TRUE, 1L, 1, "1.0")) R> jl(b2) (true, 1, 1.0, "1.0")
5.
julia variables within R (mode jleval)julia> a = [true, 1, 1.0, "1.0"] 4-element Vector{Any}: true 1 1.0 "1.0" julia> typeof(a) Vector{Any} (alias for Array{Any, 1}) julia> b = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(b) Tuple{Bool, Int64, Float64, String}
R> jleval('a =[true, 1, 1.0, "1.0"]') 4-element Vector{Any}: true 1 1.0 "1.0" R> jleval('a') 4-element Vector{Any}: true 1 1.0 "1.0" R> jleval('b = (true, 1, 1.0, "1.0")') (true, 1, 1.0, "1.0") R> jleval('b') (true, 1, 1.0, "1.0") R> ## error below don't crash R> jleval('b = (true, 1, 1.0, "1.0"') Julia Exception: ErrorException
5.
julia variables within R (mode jlvalue)julia> a = [true, 1, 1.0, "1.0"] 4-element Vector{Any}: true 1 1.0 "1.0" julia> typeof(a) Vector{Any} (alias for Array{Any, 1}) julia> b = (true, 1, 1.0, "1.0") (true, 1, 1.0, "1.0") julia> typeof(b) Tuple{Bool, Int64, Float64, String}
R> jlvalue_eval('a =[true, 1, 1.0, "1.0"]') 4-element Vector{Any}: true 1 1.0 "1.0" R> jlvalue_eval('a') 4-element Vector{Any}: true 1 1.0 "1.0" R> jlvalue_eval('b = (true, 1, 1.0, "1.0")') (true, 1, 1.0, "1.0") R> jlvalue_eval('b') (true, 1, 1.0, "1.0") R> ## error below will crash badly R> # jlvalue_eval('b = (true, 1, 1.0, "1.0"')
6.
julia function call within R (mode jl)julia> sum sum (generic function with 10 methods) julia> typeof(sum) typeof(sum) (singleton type of function sum, subtype of Function) julia> sum([1,3,2]) 6 julia> sum([1,3,2], init=4) 10 julia> isa(sum,Function) true julia> sum isa Function true
R> jl(sum) sum (generic function with 10 methods) R> class(jl(sum)) [1] "typeof(sum)" "jlfunction" R> jl(typeof)(sum) typeof(sum) (singleton type of function sum, subtype of Function) R> jl(sum)(`[1,3,2]`) 6 R> jl(sum)(c(1,3,2)) 6.0 R> jl(sum)(c(1,3,2), init=4) 10.0 R> jl(isa)( `sum`,`Function`) true R> jl(isa)(sum,Function) true
6.
julia function call within R (mode jleval)julia> sum sum (generic function with 10 methods) julia> typeof(sum) typeof(sum) (singleton type of function sum, subtype of Function) julia> sum([1,3,2]) 6 julia> sum([1,3,2], init=4) 10 julia> isa(sum,Function) true julia> sum isa Function true
R> jleval("sum") sum (generic function with 10 methods) R> jleval("typeof(sum)") typeof(sum) (singleton type of function sum, subtype of Function) R> jlcall("sum", jleval("[1,3,2]")) 6 R> jlcall("sum", c(1,3,2), init=4) 10.0 R> jlcall("isa", jleval("sum"),jleval("Function")) true R> jleval("sum isa Function") true
6.
julia function call within R (mode jlvalue)julia> sum sum (generic function with 10 methods) julia> typeof(sum) typeof(sum) (singleton type of function sum, subtype of Function) julia> sum([1,3,2]) 6 julia> sum([1,3,2], init=4) 10 julia> isa(sum,Function) true julia> sum isa Function true
R> jlvalue_eval("sum") sum (generic function with 10 methods) R> jlvalue_eval("typeof(sum)") typeof(sum) (singleton type of function sum, subtype of Function) R> jlvalue_call("sum",jlvalue_eval("[1,3,2]")) 6 R> ## Not possible: jlvalue_call("sum", jlvalue([1,3,2]), init=4)") R> jlvalue_eval("isa(sum,Function)") true R> jlvalue_eval("sum isa Function") true
7. Conversion
julia to RR> jl(a = `[1,3,2]`) R> R(jl(a)) [1] 1 3 2 R> jl(sum)(a) 6 R> jl(sum)(a) 6 R> R(jl(sum)(a)) [1] 6 R> R(jl(sum)(c(1,3,2), init=4)) [1] 10 R> R(jlcall("sum", c(1,3,2), init=4)) [1] 10
►Rulia: jl mode more in details
Rmetaprogramming allows us to offer a format very similar to originaljuliacode- no string to call
juliafunction
1. Mode
Règle 1: jl+: fonctionnement de jl()jl(`...`): if argument of jl() is a name (between backticks) then julia code execution and the returned result if of R class jlvalue
R> jl(`10`) 10 R> jl(`'1'`) '1': ASCII/Unicode U+0031 (category Nd: Number, decimal digit) R> jl(`"1"`) "1" R> jl(`(a=[1,"toto", '1'], b=12//13)`) (a = Any[1, "toto", '1'], b = 12//13) R> class(jl(`'1'`)) [1] "Char" "jlvalue" R> class(jl(`"1"`)) [1] "String" "Struct" "jlvalue"
1. Mode
Règle 2: if argument of jl+: fonctionnement de jl()jl() is already a jlvalue, the returned resukt is itself
R> jl(`10`) 10 R> jl(jl(`10`)) 10 R> jlval10 <- jl(`10`) R> class(jlval10) [1] "Int64" "jlvalue" R> jl(jlval10) 10 R> R(jlval10) [1] 10
1. Mode
Règle 3: if argument of jl+: fonctionnement de jl()jl() is an R object, it is converted within julia and an jlvalue external pointer is returned (thanks to a redirection to jlvalue())
R> c(jl(10), jl(10L), jl("10"), jl(TRUE)) ## ou c(jlvalue(10), jlvalue(10L), jlvalue("10"), jlvalue(TRUE)) [[1]] 10.0 [[2]] 10 [[3]] "10" [[4]] true
1. Mode
Règle 4: jl+: fonctionnement de jl()R variable comme argument de jl() higher priority than a julia variable with the same name
R> jl(a = 10) R> a <- 12 R> jl(a) 12.0 R> jl()$a 10.0 R> jl(b = 11) R> b Erreur dans evalq({ : objet 'b' introuvable R> jl(b) 11.0 R> jl()$b 11.0
1. Mode
Règle 5: if argument of jl+: fonctionnement de jl()jl() is a julia expression (content without backticks), julia object is returned as a jlvalue (in fact in R, `abc` and abc is the same, so jl(sin) is the same than jl(`sin`)) )
R> jl(sin) sin (generic function with 14 methods) R> jl(Array) Array R> jl(Vector) Vector (alias for Array{T, 1} where T) R> jl(jl(Vector)) Vector (alias for Array{T, 1} where T) R> class(jl(Vector)) [1] "UnionAll" "jlfunction" R> ## jl(Vector{Float64}) Not possible since Vector{Float64} is not an R expression R> jl(`Vector{Float64}`) Vector{Float64} (alias for Array{Float64, 1})
1. Mode
Règle 6:
jl+: fonctionnement de jl()-
if
jl()returns ajuliafunction, it is then converted to ajlfunctionthat can be executed withinR -
when calling a
jlfunction, in afriendly manner,jl()is applied implicitly to arguments.
R> jl(sin) sin (generic function with 14 methods) R> class(jl(sin)) [1] "typeof(sin)" "jlfunction" R> jl(sin)(13) # conversion R -> julia 0.4201670368266409 R> jl(sin)(`13`) # no conversion 0.4201670368266409 R> jl(`sqrt β +`) sqrt β (+) R> class(jl(`sqrt β +`)) [1] "ComposedFunction{typeof(sqrt), typeof(+)}" "jlfunction" R> jl(`sqrt β +`)(3, 6) 3.0
1. Mode
Règle 7: if argument of jl+: fonctionnement de jl()jl() is a name with bad julia expression, a jlexception is returned
R> jl(sin)(`[1,2]`) Julia Exception: MethodError R> class(jl(sin)(`[1,2]`)) [1] "MethodError" "jlexception" "jlvalue"
2. Broadcasting with
Ruliajulia> sin.([1,4,2]) 3-element Vector{Float64}: 0.8414709848078965 -0.7568024953079282 0.9092974268256817 julia> broadcast(sin, [1,4,2]) 3-element Vector{Float64}: 0.8414709848078965 -0.7568024953079282 0.9092974268256817
R> jl(broadcast)(sin, c(1,4,2)) 3-element Vector{Float64}: 0.8414709848078965 -0.7568024953079282 0.9092974268256817
3. Splat with
Ruliajulia> Ξ£(x...) = sum(x) Ξ£ (generic function with 1 method) julia> Ξ£(1,3,2,4.0), Ξ£(1,3,2) (10.0, 6) julia> julia> somme = splat(+) splat(+) julia> somme([1,4,2.0]) 7.0
R> ## DΓ©finition inside julia R> jl(`Ξ£(x...) = sum(x)`) Ξ£ (generic function with 1 method) R> jl(Ξ£)(1,3,2,4.0) 10.0 R> jl(Ξ£)(1,3,2) 6.0 R> ## Directly in R R> Ξ£2 <- jl(splat)(`+`) R> Ξ£2(c(1,4,2)) 7.0 R> jl(somme = jl(splat)(`+`) ) R> jl(somme) # saved inside R splat(+) R> jl(somme)(c(1,4,2)) 7.0
4. Operators
isa and <: with Ruliajulia> isa(1, Number) true julia> isa(1.0, Real) true julia> 1.0 isa Real true julia> typeof(1.0) Float64 julia> <:(typeof(1.0),Real) true julia> typeof(1.0) <: Real true
R> jl(isa)(1, Number) # 1 is converted to 1.0 true R> jl(`<:`)(jl(typeof)(1.0), Real) true R> "%isa%" <- function(a, b) jl(isa)(substitute(a),substitute(b)) R> "%<:%" <- function(a, b) jl(`<:`)(substitute(a),substitute(b)) R> 1 %isa% Number true R> jl(typeof)(1.0) Float64 R> Float64 %<:% Real true
►Rulia advanced use in Statistics
- conversion
data.frame(R) andDataFrame(julia) - conversion
factor(R) andCategoricalArrays(julia)
1. Data Frame with
Ruliajulia> using DataFrames julia> df = DataFrame(a=1:4, b=repeat([true,false],2)) 4Γ2 DataFrame Row β a b β Int64 Bool ββββββΌββββββββββββββ 1 β 1 true 2 β 2 false 3 β 3 true 4 β 4 false julia> nrow(df), ncol(df) (4, 2) julia> DataFrame <: AbstractArray false
R> ## Done for you: jlusing(DataFrames) R> jl(df = data.frame(a=1:4, b=rep(c(TRUE,FALSE), 2))) R> R(jl(df)) a b 1 1 TRUE 2 2 FALSE 3 3 TRUE 4 4 FALSE R> jl(nrow)(df) 4 R> jl(`<:`)(DataFrame, AbstractArray) false R> DataFrame %<:% AbstractArray false R> df %isa% DataFrame true
2. Facteur with
Ruliajulia> using CategoricalArrays julia> fa = categorical(["titi","toto","titi"]) 3-element CategoricalArray{String,1,UInt32}: "titi" "toto" "titi" julia> levels(fa) 2-element Vector{String}: "titi" "toto" julia> CategoricalArray <: AbstractArray true
R> ## Done for you: jlusing(CategoricalArrays) R> jl(fa = factor(c("titi","toto","titi"))) R> jl(levels)(fa) 2-element Vector{String}: "titi" "toto" R> jl(`<:`)(CategoricalArray, AbstractArray) true R> CategoricalArray %<:% AbstractArray true R> fa %isa% CategoricalArray true
3. Distributions with
Ruliajulia> using Distributions julia> X = Normal() Distributions.Normal{Float64}(ΞΌ=0.0, Ο=1.0) julia> v = rand(X,100); julia> first(v,3) 3-element Vector{Float64}: 0.013900421708920682 -0.733892036458619 -1.083481478375481 julia> fit(Normal,v) Distributions.Normal{Float64}(ΞΌ=0.026161937718674335, Ο=1.0134677955904627)
R> jlusing(Distributions) R> jl(X = `Normal()`) R> v = jl(rand)(X,100L) R> jl(first)(v,3L) 3-element Vector{Float64}: 0.9372061292245324 -0.546258195514108 2.2278904198461857 R> jl(fit)(Normal,v) Normal{Float64}(ΞΌ=0.1076152320761651, Ο=1.0724267852606377)
4. Some ideas about development of
VirtualAgeModelsJL via Ruliajulia> using Random julia> Random.seed!(13) TaskLocalRNG() julia> using VirtualAgeModels julia> m_sim = @vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5))); julia> df = rand(m_sim,100); julia> first(df,3) 3Γ2 DataFrame Row β Time Type β Float64 Int64 ββββββΌβββββββββββββββββ 1 β 9.56823 -1 2 β 20.1771 -1 3 β 33.4687 -1 julia> m_mle = @vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5))); julia> mle(m_mle, df); julia> params(m_mle) 3-element Vector{Float64}: 0.001 2.216030223105565 0.5140226562432094
R> jl_set.seed(13) R> jlusing(VirtualAgeModels) R> jl(m_sim = `@vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5)))`); R> jl(df = jl(rand)(m_sim,100L)); R> jl(first)(df,3L) 3Γ2 DataFrame Row β Time Type β Float64 Int64 ββββββΌββββββββββββββββ 1 β 20.4348 -1 2 β 21.5315 -1 3 β 26.1267 -1 R> jl(m_mle = `@vam(Time & Type ~ (ARAβ(0.4) | Weibull(.001,2.5)))`); R> invisible(jl(mle)(m_mle, df)) R> jl(params)(m_mle) 3-element Vector{Float64}: 0.001 2.575466578912153 0.3955855367296208
►Comments
A few words on the dΓ©veloppement of Rulia
-
class
jlvalueas anexternal pointer -
garbage collector of
juliacalled whenever the garbage collector ofRis cleaning anRobject of classjlvalue -
function of class
jlfunction -
errors managed by
Robject of classjlexception
Rulia limitations
-
Ruliais a package still experimental and then not considered as stable -
Binary installation only provided for
Windows. For other OS, installation needs (at least now) aCcompiler (see https://github.com/rcqls/Rulia )