PragPub 2010-12: Issue #18 by The Pragmatic Bookshelf

Author: The Pragmatic Bookshelf
Language: eng
Format: epub, mobi
Tags: PragPub—Monthly Magazine
Publisher: The Pragmatic Bookshelf, LLC
Published: 2010-12-01T05:00:00+00:00


Multinationalization

Ruby is a citizen of the world, and the world speaks many languages and uses many different sets of characters doing it. Older Rubies ignored this: to them, strings were just sequences of 8-bit bytes. Ruby 1.9 changes this. Saying that Ruby 1.9 is encoding aware is a bit like saying that Google has some servers. Many languages say they support international character sets because they have Unicode support. Well, so does Ruby. But Ruby also has support for 94 other encodings, from plain old ASCII, through SJIS, to KOI8, to old favorites like 8859-1.

What does it mean to support these encodings? Well, first it means that strings, regular expressions, and symbols are suddenly a lot smarter. Rather than being sequences of 8-bit bytes, they’re now sequences of characters. For example, in UTF-8, the string ∂og has three characters, but is represented as five bytes internally. If we run the following program with Ruby 1.8:

str = "∂og"

puts str.length

puts str[0]

puts str.reverse

We see the following:

5

226

go???

Notice that the length is the number of bytes in the string, the first character is returned as an integer, and the reversed string is mangled. But run it with Ruby 1.9, and you see something very different:

3

∂

go∂

But, to make this work, I had to do one extra thing. Remember that Ruby supports almost 100 encodings. How did it know what encoding I’d used for the source code of this program? I had to tell it. In 1.9, every source file in your program can potentially have its own encoding. If you use anything other than 7-bit ASCII in a file, you have to tell Ruby that file’s encoding using a comment on the first line of the file (or the second line if the first line is a shebang). The actual program I ran looked like this:

# encoding: utf-8

str = "∂og"

puts str.length

puts str[0]

puts str.reverse

This per-file encoding is very cool—it means that you can knit together code written using different encodings by people working all over the world, and Ruby will just do the right thing. To my knowledge, that’s unique among programming languages.
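The difference between characters and bytes, and the encoding Ruby attaches to a source file, can be inspected directly. The following is a minimal sketch, assuming Ruby 1.9 or later and a file saved as UTF-8; the variable names are illustrative and not from the original article.

```ruby
# encoding: utf-8
# The magic comment above must be on the first line (or the second, after a shebang).

str = "∂og"

char_count  = str.length       # number of characters
byte_count  = str.bytesize     # number of underlying bytes
byte_values = str.bytes.to_a   # raw byte values; to_a keeps this working on 1.9

puts char_count                # 3
puts byte_count                # 5
puts byte_values.inspect       # [226, 136, 130, 111, 103]
puts str.encoding              # UTF-8, the encoding tagged on the literal
puts __ENCODING__              # UTF-8, the encoding of this source file
```

The first byte value, 226, is the same integer that Ruby 1.8 printed for str[0] in the earlier run.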

But encoding support doesn’t just stop with program source code. When you open a file or other external data source, you can tell Ruby the encoding to use. All data read from that source will be tagged with that encoding. The same applies to data you write out. Behind the scenes, Ruby works hard to make sure that when you work with this data, you’re doing things that make sense—it’ll raise an exception, for instance, if you try to match a SJIS string using a UTF-8 regular expression.
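As a rough illustration of the paragraph above, the sketch below writes and reads a file with an explicit encoding in the mode string, then provokes the mismatch error by matching a Shift_JIS string against a UTF-8 regular expression. The scratch file path and variable names are hypothetical, and the sketch assumes Ruby 1.9 or later. Note that the exception appears only when both the string and the pattern contain non-ASCII characters; pure ASCII data is compatible across these encodings.

```ruby
# encoding: utf-8
require "tmpdir"

# Hypothetical scratch file used only for this sketch.
path = File.join(Dir.tmpdir, "encoding_demo_#{Process.pid}.txt")

begin
  # "w:utf-8" sets the external encoding for data written out.
  File.open(path, "w:utf-8") { |f| f.puts "日本語" }

  # "r:utf-8" tags every string read from the file as UTF-8.
  utf8_line = File.open(path, "r:utf-8") { |f| f.gets.chomp }
  puts utf8_line.encoding   # UTF-8
  puts utf8_line.length     # 3

  # Transcode to Shift_JIS to obtain a string in a different encoding.
  sjis_line = utf8_line.encode("Shift_JIS")
  puts sjis_line.encoding   # Shift_JIS

  # A regexp literal containing non-ASCII characters takes the
  # encoding of the source file, which is UTF-8 here.
  pattern = /日本/

  mismatch_error = nil
  begin
    sjis_line =~ pattern
  rescue Encoding::CompatibilityError => e
    mismatch_error = e
  end
  puts mismatch_error.class # Encoding::CompatibilityError
ensure
  File.delete(path) if File.exist?(path)
end
```

Transcoding one side first, for example with sjis_line.encode("UTF-8"), makes the match legal again.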

All this great support means that Ruby is incredibly well suited for writing true international applications.


