ActiveSupport::Multibyte::Unicode

Methods

C

compose

D

decompose

T

tidy_bytes

Constants

UNICODE_VERSION = RbConfig::CONFIG["UNICODE_VERSION"]
	The Unicode version supported by the implementation.

Instance Public methods

compose(codepoints)

Composes decomposed characters into their composed form (Unicode NFC). Takes an array of codepoints and returns an array of codepoints.

# File activesupport/lib/active_support/multibyte/unicode.rb, line 21
def compose(codepoints)
  codepoints.pack("U*").unicode_normalize(:nfc).codepoints
end
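
As a rough illustration of what composition does, the sketch below applies the same core Ruby calls as the source above (pack, unicode_normalize, codepoints) outside of Rails. The name compose_sketch and the sample codepoints are assumptions made for this example and are not part of ActiveSupport.

```ruby
# Hypothetical stand-alone helper that mirrors the body of compose above.
def compose_sketch(codepoints)
  codepoints.pack("U*").unicode_normalize(:nfc).codepoints
end

# "e" (U+0065) followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = [0x65, 0x301]

# NFC merges the pair into the single precomposed codepoint U+00E9.
composed = compose_sketch(decomposed)
# composed == [0xE9]
```

Codepoints with no precomposed equivalent pass through unchanged, so the result is never longer than the input.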

decompose(type, codepoints)

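Decomposes composed characters into their decomposed form. When type is :compatibility, compatibility decomposition (NFKD) is used; any other value selects canonical decomposition (NFD).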

# File activesupport/lib/active_support/multibyte/unicode.rb, line 12
def decompose(type, codepoints)
  if type == :compatibility
    codepoints.pack("U*").unicode_normalize(:nfkd).codepoints
  else
    codepoints.pack("U*").unicode_normalize(:nfd).codepoints
  end
end
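
The difference between the two decomposition types is easiest to see on a ligature. The following is a minimal sketch using only core Ruby; decompose_sketch and the sample codepoints are hypothetical names and values chosen for this example, and :canonical is just one of the many values that fall through to the NFD branch.

```ruby
# Hypothetical stand-alone helper with the same branching as decompose above.
def decompose_sketch(type, codepoints)
  form = type == :compatibility ? :nfkd : :nfd
  codepoints.pack("U*").unicode_normalize(form).codepoints
end

# U+00E9 ("é") splits into "e" plus COMBINING ACUTE ACCENT under NFD.
canonical = decompose_sketch(:canonical, [0xE9])
# canonical == [0x65, 0x301]

# U+FB01 (the "fi" ligature) has only a compatibility decomposition,
# so NFD leaves it alone while NFKD expands it to "f" and "i".
ligature_canonical = decompose_sketch(:canonical, [0xFB01])
ligature_compat    = decompose_sketch(:compatibility, [0xFB01])
# ligature_canonical == [0xFB01]
# ligature_compat    == [0x66, 0x69]
```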

tidy_bytes(string, force = false)

Replaces all ISO-8859-1 or CP1252 characters with their UTF-8 equivalents, resulting in a valid UTF-8 string.

Passing true as the force argument tidies every byte, on the assumption that the string is encoded entirely in CP1252 or ISO-8859-1.

# File activesupport/lib/active_support/multibyte/unicode.rb, line 30
def tidy_bytes(string, force = false)
  return string if string.empty? || string.ascii_only?
  return recode_windows1252_chars(string) if force
  string.scrub { |bad| recode_windows1252_chars(bad) }
end
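
The effect of the force flag can be sketched in plain Ruby. The source above calls a helper, recode_windows1252_chars, whose body is not shown on this page; the recode_cp1252 method below is an assumed stand-in that re-reads the bytes as Windows-1252 via String#encode, and tidy_bytes_sketch is likewise a hypothetical name. Treat this as an approximation under those assumptions, not as the library's implementation.

```ruby
# Assumed stand-in for the helper not shown on this page: interpret the
# raw bytes as Windows-1252 and transcode them to UTF-8.
def recode_cp1252(bytes)
  bytes.encode(Encoding::UTF_8, Encoding::Windows_1252,
               invalid: :replace, undef: :replace)
end

# Hypothetical copy of the control flow of tidy_bytes above.
def tidy_bytes_sketch(string, force = false)
  return string if string.empty? || string.ascii_only?
  return recode_cp1252(string) if force
  # Only byte sequences that are invalid UTF-8 are handed to the block.
  string.scrub { |bad| recode_cp1252(bad) }
end

# A lone 0xE9 byte is invalid UTF-8 but is "é" in CP1252.
mixed = tidy_bytes_sketch("caf\xE9")
# mixed == "café"

# Valid UTF-8 is left untouched unless force is true, in which case
# each byte of "é" (0xC3, 0xA9) is read as a separate CP1252 character.
untouched = tidy_bytes_sketch("caf\u00E9")
forced    = tidy_bytes_sketch("caf\u00E9", true)
# untouched == "café"
# forced    == "cafÃ©"
```

The forced result shows why force should be reserved for input known to be CP1252 or ISO-8859-1 throughout: applying it to a string that is already valid UTF-8 double-encodes the non-ASCII characters.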
