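Conclusion
This post is about making it possible to use multiple Ruby source files with builtin. In short, it makes speeding up Ruby with Ruby easier to do.
It digs deeper into an article I wrote earlier, so if you would prefer a lighter read, please see that article instead.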
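Background
Recent versions of CRuby let you implement Ruby methods in Ruby.
For some methods, writing the implementation in Ruby can make them faster.
See the linked article for details.
However, at present you cannot use multiple Ruby source files when implementing Ruby in Ruby.
When an implementation is spread across multiple Ruby source files, C code is sometimes partially embedded, like this: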
class TrueClass
  # Document-class: TrueClass
  #
  #  The global value <code>true</code> is the only instance of class
  #  TrueClass and represents a logically true value in
  #  boolean expressions. The class provides operators allowing
  #  <code>true</code> to be used in logical expressions.
  #

  #
  # call-seq:
  #   true.to_s   ->  "true"
  #
  # The string representation of <code>true</code> is "true".
  #
  def to_s
    Primitive.attr! 'inline'
    Primitive.cexpr! 'rb_cTrueClass_to_s'
  end

  def inspect
    Primitive.attr! 'inline'
    Primitive.cexpr! 'rb_cTrueClass_to_s'
  end

  #
  #  call-seq:
  #     true & obj    -> true or false
  #
  #  And---Returns <code>false</code> if <i>obj</i> is
  #  <code>nil</code> or <code>false</code>, <code>true</code> otherwise.
  #
  def &(obj)
    obj ? true : false
  end

  #
  #  call-seq:
  #     true ^ obj   -> !obj
  #
  #  Exclusive Or---Returns <code>true</code> if <i>obj</i> is
  #  <code>nil</code> or <code>false</code>, <code>false</code>
  #  otherwise.
  #
  def ^(obj)
    obj ? false : true
  end

  #
  #  call-seq:
  #     true | obj   -> true
  #
  #  Or---Returns <code>true</code>. As <i>obj</i> is an argument to
  #  a method call, it is always evaluated; there is no short-circuit
  #  evaluation in this case.
  #
  #     true |  puts("or")
  #     true || puts("logical or")
  #
  #  <em>produces:</em>
  #
  #     or
  #
  def |(bool)
    true
  end
end
module Kernel
  #
  #  call-seq:
  #     obj.class    -> class
  #
  #  Returns the class of <i>obj</i>. This method must always be called
  #  with an explicit receiver, as #class is also a reserved word in
  #  Ruby.
  #
  #     1.class      #=> Integer
  #     self.class   #=> Object
  #--
  # Equivalent to \c Object\#class in Ruby.
  #
  # Returns the class of \c obj, skipping singleton classes or module inclusions.
  #++
  #
  def class
    Primitive.attr! 'inline'
    Primitive.cexpr! 'rb_obj_class(self)'
  end

  #
  #  call-seq:
  #     obj.clone(freeze: nil) -> an_object
  #
  #  Produces a shallow copy of <i>obj</i>---the instance variables of
  #  <i>obj</i> are copied, but not the objects they reference.
  #  #clone copies the frozen value state of <i>obj</i>, unless the
  #  +:freeze+ keyword argument is given with a false or true value.
  #  See also the discussion under Object#dup.
  #
  #     class Klass
  #        attr_accessor :str
  #     end
  #     s1 = Klass.new      #=> #<Klass:0x401b3a38>
  #     s1.str = "Hello"    #=> "Hello"
  #     s2 = s1.clone       #=> #<Klass:0x401b3998 @str="Hello">
  #     s2.str[1,4] = "i"   #=> "i"
  #     s1.inspect          #=> "#<Klass:0x401b3a38 @str=\"Hi\">"
  #     s2.inspect          #=> "#<Klass:0x401b3998 @str=\"Hi\">"
  #
  #  This method may have class-specific behavior. If so, that
  #  behavior will be documented under the #+initialize_copy+ method of
  #  the class.
  #
  def clone(freeze: nil)
    Primitive.rb_obj_clone2(freeze)
  end

  #
  #  call-seq:
  #     obj.frozen?    -> true or false
  #
  #  Returns the freeze status of <i>obj</i>.
  #
  #     a = [ "a", "b", "c" ]
  #     a.freeze    #=> ["a", "b", "c"]
  #     a.frozen?   #=> true
  #--
  # Determines if the object is frozen. Equivalent to \c Object\#frozen? in Ruby.
  # \param[in] obj  the object to be determines
  # \retval Qtrue if frozen
  # \retval Qfalse if not frozen
  #++
  #
  def frozen?
    Primitive.attr! 'inline'
    Primitive.cexpr! 'rb_obj_frozen_p(self)'
  end

  #
  #  call-seq:
  #     obj.tap {|x| block }    -> obj
  #
  #  Yields self to the block, and then returns self.
  #  The primary purpose of this method is to "tap into" a method chain,
  #  in order to perform operations on intermediate results within the chain.
  #
  #     (1..10)                  .tap {|x| puts "original: #{x}" }
  #       .to_a                  .tap {|x| puts "array: #{x}" }
  #       .select {|x| x.even? } .tap {|x| puts "evens: #{x}" }
  #       .map {|x| x*x }        .tap {|x| puts "squares: #{x}" }
  #
  #--
  # \private
  #++
  #
  def tap
    yield(self)
    self
  end

  #
  #  call-seq:
  #     obj.then {|x| block }          -> an_object
  #
  #  Yields self to the block and returns the result of the block.
  #
  #     3.next.then {|x| x**x }.to_s             #=> "256"
  #
  #  Good usage for +then+ is value piping in method chains:
  #
  #     require 'open-uri'
  #     require 'json'
  #
  #     construct_url(arguments).
  #       then {|url| open(url).read }.
  #       then {|response| JSON.parse(response) }
  #
  #  When called without block, the method returns +Enumerator+,
  #  which can be used, for example, for conditional
  #  circuit-breaking:
  #
  #     # meets condition, no-op
  #     1.then.detect(&:odd?)            # => 1
  #     # does not meet condition, drop value
  #     2.then.detect(&:odd?)            # => nil
  #
  def then
    unless Primitive.block_given_p
      return Primitive.cexpr! 'SIZED_ENUMERATOR(self, 0, 0, rb_obj_size)'
    end
    yield(self)
  end

  #
  #  call-seq:
  #     obj.yield_self {|x| block }    -> an_object
  #
  #  Yields self to the block and returns the result of the block.
  #
  #     "my string".yield_self {|s| s.upcase }   #=> "MY STRING"
  #
  #  Good usage for +then+ is value piping in method chains:
  #
  #     require 'open-uri'
  #     require 'json'
  #
  #     construct_url(arguments).
  #       then {|url| open(url).read }.
  #       then {|response| JSON.parse(response) }
  #
  def yield_self
    unless Primitive.block_given_p
      return Primitive.cexpr! 'SIZED_ENUMERATOR(self, 0, 0, rb_obj_size)'
    end
    yield(self)
  end

  module_function

  #
  #  call-seq:
  #     Float(arg, exception: true)    -> float or nil
  #
  #  Returns <i>arg</i> converted to a float. Numeric types are
  #  converted directly, and with exception to String and
  #  <code>nil</code> the rest are converted using
  #  <i>arg</i><code>.to_f</code>.  Converting a String with invalid
  #  characters will result in a ArgumentError.  Converting
  #  <code>nil</code> generates a TypeError.  Exceptions can be
  #  suppressed by passing <code>exception: false</code>.
  #
  #     Float(1)                 #=> 1.0
  #     Float("123.456")         #=> 123.456
  #     Float("123.0_badstring") #=> ArgumentError: invalid value for Float(): "123.0_badstring"
  #     Float(nil)               #=> TypeError: can't convert nil into Float
  #     Float("123.0_badstring", exception: false)  #=> nil
  #
  def Float(arg, exception: true)
    Primitive.rb_f_float(arg, exception)
  end
end
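Also, when C code is used from Ruby in this way, the automatically generated headers have to be included.
In this case both TrueClass and Kernel use a Ruby implementation, so the generated headers must be included in object.c, where both are defined, as follows: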
#include "kernel.rbinc"
#include "trueclass.rbinc"
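kernel.rbinc and the other .rbinc files are headers generated automatically when building with make.
The C functions defined in these generated headers are where the problem lies.
When make runs, CRuby scans the C code embedded in the Ruby sources and converts it into headers automatically (in other words, the parts implemented in Ruby are translated into C).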
Concretely, the relevant rules are in common.mk:
{$(srcdir)}.rb.rbinc:
	$(ECHO) making $@
	$(Q) $(BASERUBY) $(tooldir)/mk_builtin_loader.rb $<

builtin_binary.inc: $(PREP) $(BUILTIN_RB_SRCS) $(srcdir)/template/builtin_binary.inc.tmpl
	$(Q) $(MINIRUBY) $(tooldir)/generic_erb.rb -o $@ \
		$(srcdir)/template/builtin_binary.inc.tmpl -- --cross=$(CROSS_COMPILING)
These rules run mk_builtin_loader.rb as part of the build.
At build time, tool/mk_builtin_loader.rb receives a Ruby source file as its argument (one placed as a builtin, e.g. array.rb) and generates a header from it.
# Parse built-in script and make rbinc file

require 'ripper'
require 'stringio'
require_relative 'ruby_vm/helpers/c_escape'

def string_literal(lit, str = [])
  while lit
    case lit.first
    when :string_concat, :string_embexpr, :string_content
      _, *lit = lit
      lit.each {|s| string_literal(s, str)}
      return str
    when :string_literal
      _, lit = lit
    when :@tstring_content
      str << lit[1]
      return str
    else
      raise "unexpected #{lit.first}"
    end
  end
end

def inline_text argc, arg1
  raise "argc (#{argc}) of inline! should be 1" unless argc == 1
  arg1 = string_literal(arg1)
  raise "1st argument should be string literal" unless arg1
  arg1.join("").rstrip
end

def make_cfunc_name inlines, name, lineno
  case name
  when /\[\]/
    name = '_GETTER'
  when /\[\]=/
    name = '_SETTER'
  else
    name = name.tr('!?', 'EP')
  end

  base = "builtin_inline_#{name}_#{lineno}"
  if inlines[base]
    1000.times{|i|
      name = "#{base}_#{i}"
      return name unless inlines[name]
    }
    raise "too many functions in same line..."
  else
    base
  end
end

def collect_locals tree
  type, name, (line, cols) = tree
  if locals = LOCALS_DB[[name, line]]
    locals
  else
    if false # for debugging
      pp LOCALS_DB
      raise "not found: [#{name}, #{line}]"
    end
  end
end

def collect_builtin base, tree, name, bs, inlines, locals = nil
  while tree
    call = recv = sep = mid = args = nil
    case tree.first
    when :def
      locals = collect_locals(tree[1])
      tree = tree[3]
      next
    when :defs
      locals = collect_locals(tree[3])
      tree = tree[5]
      next
    when :class
      name = 'class'
      tree = tree[3]
      next
    when :sclass, :module
      name = 'class'
      tree = tree[2]
      next
    when :method_add_arg
      _, mid, (_, (_, args)) = tree
      case mid.first
      when :call
        _, recv, sep, mid = mid
      when :fcall
        _, mid = mid
      else
        mid = nil
      end
    when :vcall
      _, mid = tree
    when :command               # FCALL
      _, mid, (_, args) = tree
    when :call, :command_call   # CALL
      _, recv, sep, mid, (_, args) = tree
    end
    if mid
      raise "unknown sexp: #{mid.inspect}" unless mid.first == :@ident
      _, mid, (lineno,) = mid
      if recv
        func_name = nil
        case recv.first
        when :var_ref
          _, recv = recv
          if recv.first == :@const and recv[1] == "Primitive"
            func_name = mid.to_s
          end
        when :vcall
          _, recv = recv
          if recv.first == :@ident and recv[1] == "__builtin"
            func_name = mid.to_s
          end
        end
        collect_builtin(base, recv, name, bs, inlines) unless func_name
      else
        func_name = mid[/\A__builtin_(.+)/, 1]
      end
      if func_name
        cfunc_name = func_name
        args.pop unless (args ||= []).last
        argc = args.size

        if /(.+)\!\z/ =~ func_name
          case $1
          when 'attr'
            text = inline_text(argc, args.first)
            if text != 'inline'
              raise "Only 'inline' is allowed to be annotated (but got: '#{text}')"
            end
            break
          when 'cstmt'
            text = inline_text argc, args.first

            func_name = "_bi#{inlines.size}"
            cfunc_name = make_cfunc_name(inlines, name, lineno)
            inlines[cfunc_name] = [lineno, text, locals, func_name]
            argc -= 1
          when 'cexpr', 'cconst'
            text = inline_text argc, args.first
            code = "return #{text};"

            func_name = "_bi#{inlines.size}"
            cfunc_name = make_cfunc_name(inlines, name, lineno)

            locals = [] if $1 == 'cconst'
            inlines[cfunc_name] = [lineno, code, locals, func_name]
            argc -= 1
          when 'cinit'
            text = inline_text argc, args.first
            func_name = nil # required
            inlines[inlines.size] = [lineno, text, nil, nil]
            argc -= 1
          end
        end

        if bs[func_name] &&
           bs[func_name] != [argc, cfunc_name]
          raise "same builtin function \"#{func_name}\", but different arity (was #{bs[func_name]} but #{argc})"
        end

        bs[func_name] = [argc, cfunc_name] if func_name
      end
      break unless tree = args
    end

    tree.each do |t|
      collect_builtin base, t, name, bs, inlines, locals if Array === t
    end
    break
  end
end

# ruby mk_builtin_loader.rb TARGET_FILE.rb
# #=> generate TARGET_FILE.rbinc
#

LOCALS_DB = {} # [method_name, first_line] = locals

def collect_iseq iseq_ary
  # iseq_ary.each_with_index{|e, i| p [i, e]}
  label = iseq_ary[5]
  first_line = iseq_ary[8]
  type = iseq_ary[9]
  locals = iseq_ary[10]
  insns = iseq_ary[13]

  if type == :method
    LOCALS_DB[[label, first_line].freeze] = locals
  end

  insns.each{|insn|
    case insn
    when Integer
      # ignore
    when Array
      # p insn.shift # insn name
      insn.each{|op|
        if Array === op && op[0] == "YARVInstructionSequence/SimpleDataFormat"
          collect_iseq op
        end
      }
    end
  }
end

def generate_cexpr(ofile, lineno, line_file, body_lineno, text, locals, func_name)
  f = StringIO.new
  f.puts '{'
  lineno += 1
  locals.reverse_each.with_index{|param, i|
    next unless Symbol === param
    f.puts "MAYBE_UNUSED(const VALUE) #{param} = rb_vm_lvar(ec, #{-3 - i});"
    lineno += 1
  }
  f.puts "#line #{body_lineno} \"#{line_file}\""
  lineno += 1

  f.puts text
  lineno += text.count("\n") + 1

  f.puts "#line #{lineno + 2} \"#{ofile}\"" # TODO: restore line number.
  f.puts "}"
  f.puts
  lineno += 3

  return lineno, f.string
end

def mk_builtin_header file
  base = File.basename(file, '.rb')
  ofile = "#{file}inc"

  # bs = { func_name => argc }
  code = File.read(file)
  collect_iseq RubyVM::InstructionSequence.compile(code).to_a
  collect_builtin(base, Ripper.sexp(code), 'top', bs = {}, inlines = {})

  begin
    f = open(ofile, 'w')
  rescue Errno::EACCES
    # Fall back to the current directory
    f = open(File.basename(ofile), 'w')
  end
  begin
    if File::ALT_SEPARATOR
      file = file.tr(File::ALT_SEPARATOR, File::SEPARATOR)
      ofile = ofile.tr(File::ALT_SEPARATOR, File::SEPARATOR)
    end

    lineno = __LINE__
    f.puts "// -*- c -*-"
    f.puts "// DO NOT MODIFY THIS FILE DIRECTLY."
    f.puts "// auto-generated file"
    f.puts "// by #{__FILE__}"
    f.puts "// with #{file}"
    f.puts '#include "internal/compilers.h" /* for MAYBE_UNUSED */'
    f.puts '#include "internal/warnings.h" /* for COMPILER_WARNING_PUSH */'
    f.puts '#include "ruby/ruby.h" /* for VALUE */'
    f.puts '#include "builtin.h" /* for RB_BUILTIN_FUNCTION */'
    f.puts 'struct rb_execution_context_struct; /* in vm_core.h */'
    f.puts
    lineno = __LINE__ - lineno - 1
    line_file = file

    inlines.each{|cfunc_name, (body_lineno, text, locals, func_name)|
      if String === cfunc_name
        f.puts "static VALUE #{cfunc_name}(struct rb_execution_context_struct *ec, const VALUE self)"
        lineno += 1
        lineno, str = generate_cexpr(ofile, lineno, line_file, body_lineno, text, locals, func_name)
        f.write str
      else
        # cinit!
        f.puts "#line #{body_lineno} \"#{line_file}\""
        lineno += 1
        f.puts text
        lineno += text.count("\n") + 1
        f.puts "#line #{lineno + 2} \"#{ofile}\"" # TODO: restore line number.
        lineno += 1
      end
    }

    bs.each_pair{|func, (argc, cfunc_name)|
      decl = ', VALUE' * argc
      argv = argc                    \
           . times                   \
           . map {|i|", argv[#{i}]"} \
           . join('')
      f.puts %'static void'
      f.puts %'mjit_compile_invokebuiltin_for_#{func}(FILE *f, long index, unsigned stack_size, bool inlinable_p)'
      f.puts %'{'
      f.puts %' fprintf(f, " VALUE self = GET_SELF();\\n");'
      f.puts %' fprintf(f, " typedef VALUE (*func)(rb_execution_context_t *, VALUE#{decl});\\n");'
      if inlines.has_key? cfunc_name
        body_lineno, text, locals, func_name = inlines[cfunc_name]
        lineno, str = generate_cexpr(ofile, lineno, line_file, body_lineno, text, locals, func_name)
        f.puts %' if (inlinable_p) {'
        str.gsub(/^(?!#)/, ' ').each_line {|i|
          j = RubyVM::CEscape.rstring2cstr(i).dup
          j.sub!(/^ return\b/ , ' val =')
          f.printf(%' fprintf(f, "%%s", %s);\n', j)
        }
        f.puts(%' return;')
        f.puts(%' }')
      end
      if argc > 0
        f.puts %' if (index == -1) {'
        f.puts %' fprintf(f, " const VALUE *argv = &stack[%d];\\n", stack_size - #{argc});'
        f.puts %' }'
        f.puts %' else {'
        f.puts %' fprintf(f, " const unsigned int lnum = GET_ISEQ()->body->local_table_size;\\n");'
        f.puts %' fprintf(f, " const VALUE *argv = GET_EP() - lnum - VM_ENV_DATA_SIZE + 1 + %ld;\\n", index);'
        f.puts %' }'
      end
      f.puts %' fprintf(f, " func f = (func)%"PRIdPTR"; /* == #{cfunc_name} */\\n", (intptr_t)#{cfunc_name});'
      f.puts %' fprintf(f, " val = f(ec, self#{argv});\\n");'
      f.puts %'}'
      f.puts
    }

    f.puts "void Init_builtin_#{base}(void)"
    f.puts "{"

    table = "#{base}_table"
    f.puts " // table definition"
    f.puts " static const struct rb_builtin_function #{table}[] = {"
    bs.each.with_index{|(func, (argc, cfunc_name)), i|
      f.puts " RB_BUILTIN_FUNCTION(#{i}, #{func}, #{cfunc_name}, #{argc}, mjit_compile_invokebuiltin_for_#{func}),"
    }
    f.puts " RB_BUILTIN_FUNCTION(-1, NULL, NULL, 0, 0),"
    f.puts " };"

    f.puts
    f.puts " // arity_check"
    f.puts "COMPILER_WARNING_PUSH"
    f.puts "#if GCC_VERSION_SINCE(5, 1, 0) || __clang__"
    f.puts "COMPILER_WARNING_ERROR(-Wincompatible-pointer-types)"
    f.puts "#endif"
    bs.each{|func, (argc, cfunc_name)|
      f.puts " if (0) rb_builtin_function_check_arity#{argc}(#{cfunc_name});"
    }
    f.puts "COMPILER_WARNING_POP"

    f.puts
    f.puts " // load"
    f.puts " rb_load_with_builtin_functions(#{base.dump}, #{table});"

    f.puts "}"
  ensure
    f.close
  end
end

ARGV.each{|file|
  # feature.rb => load_feature.inc
  mk_builtin_header file
}
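To make the scanning step easier to picture, here is a minimal sketch of how embedded C snippets can be located with Ripper. It is not the script's actual logic: the helper names (string_fragments, primitive_receiver?, find_primitive_calls) and the sample source are assumptions made for illustration, and only the `Primitive.xxx! '...'` command-call form is handled.

```ruby
require 'ripper'

# Gather every plain string fragment (:@tstring_content) below a node.
def string_fragments(node, out = [])
  return out unless node.is_a?(Array)
  if node[0] == :@tstring_content
    out << node[1]
  else
    node.each { |child| string_fragments(child, out) }
  end
  out
end

# True when the receiver node is the constant `Primitive`.
def primitive_receiver?(recv)
  recv.is_a?(Array) && recv[0] == :var_ref &&
    recv[1].is_a?(Array) && recv[1][0] == :@const && recv[1][1] == "Primitive"
end

# Walk the S-expression and record [method name, line, C text] for each
# `Primitive.something 'text'` call written without parentheses.
def find_primitive_calls(node, found = [])
  return found unless node.is_a?(Array)
  if node[0] == :command_call && primitive_receiver?(node[1])
    _, _recv, _sep, mid, args = node
    found << [mid[1], mid[2][0], string_fragments(args).join]
  else
    node.each { |child| find_primitive_calls(child, found) }
  end
  found
end

# Hypothetical sample input, modelled on the kernel.rb listing.
SRC = <<~'RUBY'
  module Kernel
    def frozen?
      Primitive.attr! 'inline'
      Primitive.cexpr! 'rb_obj_frozen_p(self)'
    end
  end
RUBY

find_primitive_calls(Ripper.sexp(SRC)).each do |name, line, text|
  puts "#{name} (line #{line}): #{text}"
end
```

The real script additionally tracks local variables, arity, and several other call forms; this sketch only shows the tree walk.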
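Also, when C code such as Primitive.cexpr! 'rb_obj_frozen_p(self)' is found at this point, the compile_call function in compile.c is invoked and generates a function name automatically (the function is long, so only the part that generates the name is excerpted here).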
int inline_index = GET_VM()->builtin_inline_index++;
snprintf(inline_func, 0x20, "_bi%d", inline_index); // <= builds the function name for the C code invoked from the Ruby code
This is done in order to define a function that runs the C code embedded in Ruby.
For example, Primitive.cexpr! 'rb_obj_frozen_p(self)' results in a C function defined like this:
static void
mjit_compile_invokebuiltin_for__bi1(FILE *f, long index, unsigned stack_size, bool inlinable_p)
{
    fprintf(f, " VALUE self = GET_SELF();\n");
    fprintf(f, " typedef VALUE (*func)(rb_execution_context_t *, VALUE);\n");
    if (inlinable_p) {
        fprintf(f, "%s", " {\n");
        fprintf(f, "%s", "#line 69 \"../ruby/kernel.rb\"\n");
        fprintf(f, "%s", " return rb_obj_frozen_p(self);\n");
        fprintf(f, "%s", "#line 50 \"../ruby/kernel.rbinc\"\n");
        fprintf(f, "%s", " }\n");
        fprintf(f, "%s", " \n");
        return;
    }
    fprintf(f, " func f = (func)%"PRIdPTR"; /* == builtin_inline_class_69 */\n", (intptr_t)builtin_inline_class_69);
    fprintf(f, " val = f(ec, self);\n");
}
Incidentally, when C code is embedded in several places, the functions are numbered sequentially, as in mjit_compile_invokebuiltin_for__bi1, mjit_compile_invokebuiltin_for__bi2, and so on.
That is the background.
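What happened
Because the embedded C code is defined as sequentially numbered functions, implementing Ruby with multiple Ruby source files causes a compile-time error.
This is the kind of thing you might expect in the C world.
In C, defining two functions with the same name is a fairly common way to run into an error (usually avoided by giving one of them a slightly different name).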
int func() {
    return 42;
}

int func() {
    return 21;
}

int main() {
    return func();
}
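So code like this does not compile.
Now, when TrueClass and Kernel are each implemented in a separate Ruby source file, as in this case, tool/mk_builtin_loader.rb is run once per source file.
In other words, if both files embed C code, a function named mjit_compile_invokebuiltin_for__bi1 ends up defined in each of the separate headers.
Recall how the headers are included when CRuby itself is implemented with multiple Ruby source files: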
#include "kernel.rbinc"
#include "trueclass.rbinc"
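Each header is included like this, so both definitions land in the same file (object.c) and the compiler rejects the duplicated function name.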
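The collision can be modelled in a few lines of Ruby. This is only a sketch: old_jit_names and duplicate_names are hypothetical helpers, numbering from 1 follows the example names used in this post, and the snippet counts (four cexpr! calls in kernel.rb, two in trueclass.rb) are read off the listings shown earlier.

```ruby
# Model of the old naming scheme: every source file numbers its embedded C
# snippets independently, so the generated names depend only on the counter.
def old_jit_names(inline_count)
  (1..inline_count).map { |i| format("mjit_compile_invokebuiltin_for__bi%d", i) }
end

# Names that occur more than once when all headers are combined into one
# file, which is what including both headers from object.c amounts to.
def duplicate_names(headers)
  headers.values.flatten.group_by(&:itself).select { |_, v| v.size > 1 }.keys
end

HEADERS = {
  "kernel.rbinc"    => old_jit_names(4),
  "trueclass.rbinc" => old_jit_names(2),
}.freeze

puts duplicate_names(HEADERS)
```

Every name that trueclass.rbinc defines is also defined by kernel.rbinc, which is the C-level redefinition error described above.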
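Workaround
In the linked article I took the approach of converting the string to a hash value and passing that along. That is enough to make the build succeed.
However, hash values can still collide, however unlikely that is, so it is not a very desirable implementation.
So instead, let's modify the part of tool/mk_builtin_loader.rb that generates the C functions.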
# Parse built-in script and make rbinc file

require 'ripper'
require 'stringio'
require_relative 'ruby_vm/helpers/c_escape'

def string_literal(lit, str = [])
  while lit
    case lit.first
    when :string_concat, :string_embexpr, :string_content
      _, *lit = lit
      lit.each {|s| string_literal(s, str)}
      return str
    when :string_literal
      _, lit = lit
    when :@tstring_content
      str << lit[1]
      return str
    else
      raise "unexpected #{lit.first}"
    end
  end
end

def inline_text argc, arg1
  raise "argc (#{argc}) of inline! should be 1" unless argc == 1
  arg1 = string_literal(arg1)
  raise "1st argument should be string literal" unless arg1
  arg1.join("").rstrip
end

def make_cfunc_name inlines, name, lineno
  case name
  when /\[\]/
    name = '_GETTER'
  when /\[\]=/
    name = '_SETTER'
  else
    name = name.tr('!?', 'EP')
  end

  base = "builtin_inline_#{name}_#{lineno}"
  if inlines[base]
    1000.times{|i|
      name = "#{base}_#{i}"
      return name unless inlines[name]
    }
    raise "too many functions in same line..."
  else
    base
  end
end

def collect_locals tree
  type, name, (line, cols) = tree
  if locals = LOCALS_DB[[name, line]]
    locals
  else
    if false # for debugging
      pp LOCALS_DB
      raise "not found: [#{name}, #{line}]"
    end
  end
end

def collect_builtin base, tree, name, bs, inlines, locals = nil
  while tree
    call = recv = sep = mid = args = nil
    case tree.first
    when :def
      locals = collect_locals(tree[1])
      tree = tree[3]
      next
    when :defs
      locals = collect_locals(tree[3])
      tree = tree[5]
      next
    when :class
      name = 'class'
      tree = tree[3]
      next
    when :sclass, :module
      name = 'class'
      tree = tree[2]
      next
    when :method_add_arg
      _, mid, (_, (_, args)) = tree
      case mid.first
      when :call
        _, recv, sep, mid = mid
      when :fcall
        _, mid = mid
      else
        mid = nil
      end
    when :vcall
      _, mid = tree
    when :command               # FCALL
      _, mid, (_, args) = tree
    when :call, :command_call   # CALL
      _, recv, sep, mid, (_, args) = tree
    end
    if mid
      raise "unknown sexp: #{mid.inspect}" unless mid.first == :@ident
      _, mid, (lineno,) = mid
      if recv
        func_name = nil
        case recv.first
        when :var_ref
          _, recv = recv
          if recv.first == :@const and recv[1] == "Primitive"
            func_name = mid.to_s
          end
        when :vcall
          _, recv = recv
          if recv.first == :@ident and recv[1] == "__builtin"
            func_name = mid.to_s
          end
        end
        collect_builtin(base, recv, name, bs, inlines) unless func_name
      else
        func_name = mid[/\A__builtin_(.+)/, 1]
      end
      if func_name
        cfunc_name = func_name
        args.pop unless (args ||= []).last
        argc = args.size

        if /(.+)\!\z/ =~ func_name
          case $1
          when 'attr'
            text = inline_text(argc, args.first)
            if text != 'inline'
              raise "Only 'inline' is allowed to be annotated (but got: '#{text}')"
            end
            break
          when 'cstmt'
            text = inline_text argc, args.first

            func_name = "_bi#{inlines.size}"
            cfunc_name = make_cfunc_name(inlines, name, lineno)
            inlines[cfunc_name] = [lineno, text, locals, func_name]
            argc -= 1
          when 'cexpr', 'cconst'
            text = inline_text argc, args.first
            code = "return #{text};"

            func_name = "_bi#{inlines.size}"
            cfunc_name = make_cfunc_name(inlines, name, lineno)

            locals = [] if $1 == 'cconst'
            inlines[cfunc_name] = [lineno, code, locals, func_name]
            argc -= 1
          when 'cinit'
            text = inline_text argc, args.first
            func_name = nil # required
            inlines[inlines.size] = [lineno, text, nil, nil]
            argc -= 1
          end
        end

        if bs[func_name] &&
           bs[func_name] != [argc, cfunc_name]
          raise "same builtin function \"#{func_name}\", but different arity (was #{bs[func_name]} but #{argc})"
        end

        bs[func_name] = [argc, cfunc_name] if func_name
      end
      break unless tree = args
    end

    tree.each do |t|
      collect_builtin base, t, name, bs, inlines, locals if Array === t
    end
    break
  end
end

# ruby mk_builtin_loader.rb TARGET_FILE.rb
# #=> generate TARGET_FILE.rbinc
#

LOCALS_DB = {} # [method_name, first_line] = locals

def collect_iseq iseq_ary
  # iseq_ary.each_with_index{|e, i| p [i, e]}
  label = iseq_ary[5]
  first_line = iseq_ary[8]
  type = iseq_ary[9]
  locals = iseq_ary[10]
  insns = iseq_ary[13]

  if type == :method
    LOCALS_DB[[label, first_line].freeze] = locals
  end

  insns.each{|insn|
    case insn
    when Integer
      # ignore
    when Array
      # p insn.shift # insn name
      insn.each{|op|
        if Array === op && op[0] == "YARVInstructionSequence/SimpleDataFormat"
          collect_iseq op
        end
      }
    end
  }
end

def generate_cexpr(ofile, lineno, line_file, body_lineno, text, locals, func_name)
  f = StringIO.new
  f.puts '{'
  lineno += 1
  locals.reverse_each.with_index{|param, i|
    next unless Symbol === param
    f.puts "MAYBE_UNUSED(const VALUE) #{param} = rb_vm_lvar(ec, #{-3 - i});"
    lineno += 1
  }
  f.puts "#line #{body_lineno} \"#{line_file}\""
  lineno += 1

  f.puts text
  lineno += text.count("\n") + 1

  f.puts "#line #{lineno + 2} \"#{ofile}\"" # TODO: restore line number.
  f.puts "}"
  f.puts
  lineno += 3

  return lineno, f.string
end

def mk_builtin_header file
  base = File.basename(file, '.rb')
  ofile = "#{file}inc"

  # bs = { func_name => argc }
  code = File.read(file)
  collect_iseq RubyVM::InstructionSequence.compile(code).to_a
  collect_builtin(base, Ripper.sexp(code), 'top', bs = {}, inlines = {})

  begin
    f = open(ofile, 'w')
  rescue Errno::EACCES
    # Fall back to the current directory
    f = open(File.basename(ofile), 'w')
  end
  begin
    if File::ALT_SEPARATOR
      file = file.tr(File::ALT_SEPARATOR, File::SEPARATOR)
      ofile = ofile.tr(File::ALT_SEPARATOR, File::SEPARATOR)
    end

    lineno = __LINE__
    f.puts "// -*- c -*-"
    f.puts "// DO NOT MODIFY THIS FILE DIRECTLY."
    f.puts "// auto-generated file"
    f.puts "// by #{__FILE__}"
    f.puts "// with #{file}"
    f.puts '#include "internal/compilers.h" /* for MAYBE_UNUSED */'
    f.puts '#include "internal/warnings.h" /* for COMPILER_WARNING_PUSH */'
    f.puts '#include "ruby/ruby.h" /* for VALUE */'
    f.puts '#include "builtin.h" /* for RB_BUILTIN_FUNCTION */'
    f.puts 'struct rb_execution_context_struct; /* in vm_core.h */'
    f.puts
    lineno = __LINE__ - lineno - 1
    line_file = file

    inlines.each{|cfunc_name, (body_lineno, text, locals, func_name)|
      if String === cfunc_name
        f.puts "static VALUE #{cfunc_name}(struct rb_execution_context_struct *ec, const VALUE self)"
        lineno += 1
        lineno, str = generate_cexpr(ofile, lineno, line_file, body_lineno, text, locals, func_name)
        f.write str
      else
        # cinit!
        f.puts "#line #{body_lineno} \"#{line_file}\""
        lineno += 1
        f.puts text
        lineno += text.count("\n") + 1
        f.puts "#line #{lineno + 2} \"#{ofile}\"" # TODO: restore line number.
        lineno += 1
      end
    }

    bs.each_pair{|func, (argc, cfunc_name)|
      decl = ', VALUE' * argc
      argv = argc                    \
           . times                   \
           . map {|i|", argv[#{i}]"} \
           . join('')
      f.puts %'static void'
      f.puts %'mjit_compile_invokebuiltin_for_#{base}_#{func}(FILE *f, long index, unsigned stack_size, bool inlinable_p)'
      f.puts %'{'
      f.puts %' fprintf(f, " VALUE self = GET_SELF();\\n");'
      f.puts %' fprintf(f, " typedef VALUE (*func)(rb_execution_context_t *, VALUE#{decl});\\n");'
      if inlines.has_key? cfunc_name
        body_lineno, text, locals, func_name = inlines[cfunc_name]
        lineno, str = generate_cexpr(ofile, lineno, line_file, body_lineno, text, locals, func_name)
        f.puts %' if (inlinable_p) {'
        str.gsub(/^(?!#)/, ' ').each_line {|i|
          j = RubyVM::CEscape.rstring2cstr(i).dup
          j.sub!(/^ return\b/ , ' val =')
          f.printf(%' fprintf(f, "%%s", %s);\n', j)
        }
        f.puts(%' return;')
        f.puts(%' }')
      end
      if argc > 0
        f.puts %' if (index == -1) {'
        f.puts %' fprintf(f, " const VALUE *argv = &stack[%d];\\n", stack_size - #{argc});'
        f.puts %' }'
        f.puts %' else {'
        f.puts %' fprintf(f, " const unsigned int lnum = GET_ISEQ()->body->local_table_size;\\n");'
        f.puts %' fprintf(f, " const VALUE *argv = GET_EP() - lnum - VM_ENV_DATA_SIZE + 1 + %ld;\\n", index);'
        f.puts %' }'
      end
      f.puts %' fprintf(f, " func f = (func)%"PRIdPTR"; /* == #{cfunc_name} */\\n", (intptr_t)#{cfunc_name});'
      f.puts %' fprintf(f, " val = f(ec, self#{argv});\\n");'
      f.puts %'}'
      f.puts
    }

    f.puts "void Init_builtin_#{base}(void)"
    f.puts "{"

    table = "#{base}_table"
    f.puts " // table definition"
    f.puts " static const struct rb_builtin_function #{table}[] = {"
    bs.each.with_index{|(func, (argc, cfunc_name)), i|
      f.puts " RB_BUILTIN_FUNCTION(#{i}, #{func}, #{cfunc_name}, #{argc}, mjit_compile_invokebuiltin_for_#{base}_#{func}),"
    }
    f.puts " RB_BUILTIN_FUNCTION(-1, NULL, NULL, 0, 0),"
    f.puts " };"

    f.puts
    f.puts " // arity_check"
    f.puts "COMPILER_WARNING_PUSH"
    f.puts "#if GCC_VERSION_SINCE(5, 1, 0) || __clang__"
    f.puts "COMPILER_WARNING_ERROR(-Wincompatible-pointer-types)"
    f.puts "#endif"
    bs.each{|func, (argc, cfunc_name)|
      f.puts " if (0) rb_builtin_function_check_arity#{argc}(#{cfunc_name});"
    }
    f.puts "COMPILER_WARNING_POP"

    f.puts
    f.puts " // load"
    f.puts " rb_load_with_builtin_functions(#{base.dump}, #{table});"

    f.puts "}"
  ensure
    f.close
  end
end

ARGV.each{|file|
  # feature.rb => load_feature.inc
  mk_builtin_header file
}
Two places were changed.
The line
f.puts %'mjit_compile_invokebuiltin_for_#{func}(FILE *f, long index, unsigned stack_size, bool inlinable_p)'
was changed to
f.puts %'mjit_compile_invokebuiltin_for_#{base}_#{func}(FILE *f, long index, unsigned stack_size, bool inlinable_p)'
and the line
f.puts " RB_BUILTIN_FUNCTION(#{i}, #{func}, #{cfunc_name}, #{argc}, mjit_compile_invokebuiltin_for_#{func}),"
was changed to
f.puts " RB_BUILTIN_FUNCTION(#{i}, #{func}, #{cfunc_name}, #{argc}, mjit_compile_invokebuiltin_for_#{base}_#{func}),"
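base holds the name of each Ruby source file read at build time, without the extension (e.g. array for array.rb).
As a result, the automatically generated function names now carry the source file name, which here corresponds to the class or module, as in mjit_compile_invokebuiltin_for_kernel__bi1.
With this change, CRuby can be implemented using multiple Ruby source files.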
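As a quick sanity check of the new scheme, the following sketch models the patched naming and confirms that names from two files no longer overlap. new_jit_name and new_jit_names are hypothetical helpers, and the snippet counts and numbering from 1 are assumptions carried over from the examples in this post; only the File.basename(file, '.rb') derivation mirrors the script itself.

```ruby
# Model of the patched naming scheme: the base name of the source file is
# embedded in the generated function name.
def new_jit_name(file, func)
  base = File.basename(file, ".rb") # same derivation as in mk_builtin_header
  "mjit_compile_invokebuiltin_for_#{base}_#{func}"
end

# Generate the names for a file with `inline_count` embedded C snippets.
def new_jit_names(file, inline_count)
  (1..inline_count).map { |i| new_jit_name(file, "_bi#{i}") }
end

names = new_jit_names("kernel.rb", 4) + new_jit_names("trueclass.rb", 2)
puts names
puts "all unique: #{names.uniq.size == names.size}"
```

One caveat of this design: uniqueness now depends on the base names being distinct, so two builtin sources with the same file name in different directories would still clash.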