
Punycode

From Wikipedia, the free encyclopedia

Punycode is a representation of Unicode with the limited ASCII character subset used for Internet hostnames. Using Punycode, host names containing Unicode characters are transcoded to a subset of ASCII consisting of letters, digits, and hyphens, which is called the letter–digit–hyphen (LDH) subset. For example, the German München (English: Munich) is encoded as Mnchen-3ya.

While the Domain Name System (DNS) technically supports arbitrary sequences of octets in domain name labels, the DNS standards recommend the use of the LDH subset of ASCII conventionally used for host names, and require that string comparisons between DNS domain names should be case-insensitive. The Punycode syntax is a method of encoding strings containing Unicode characters, such as internationalized domain names (IDNA), into the LDH subset of ASCII favored by DNS. It is specified in IETF Request for Comments 3492.[1]

Origin of the name


The RFC author, Adam Costello, is reported to have written:

Why “Punycode”? It rhymes with Unicode and is intended to encode Unicode strings. It is “puny” in three senses: The repertoire of characters used in the encoded strings is small, the encoded strings are short, and the implementation is small.[2]

Description


As stated in RFC 3492, "Punycode is an instance of a more general algorithm called Bootstring, which allows strings composed from a small set of 'basic' code points to uniquely represent any string of code points drawn from a larger set." Punycode defines parameters for the general Bootstring algorithm to match the characteristics of Unicode text. This section demonstrates the procedure for Punycode encoding, using as an example the German string "bücher" (English: books), which is translated into the label "bcher-kva".

To make the encoding and decoding algorithms simple, no attempt has been made to prevent some encoded values from encoding inadmissible Unicode values: however, these should be checked for and detected during decoding.

Punycode is designed to work across all scripts, and to be self-optimizing by attempting to adapt to the character set ranges within the string as it operates. It is optimized for the case where the string is composed of zero or more ASCII characters and in addition characters from only one other script system, but will cope with any arbitrary Unicode string. Note that for DNS use, the domain name string is assumed to have been normalized using nameprep and (for top-level domains) filtered against an officially registered language table before being punycoded, and that the DNS protocol sets limits on the acceptable lengths of the output Punycode string.

Separation of ASCII characters


First, all ASCII characters in the string are copied from input to output, skipping over any other characters. For example, "bücher" is copied to "bcher". If any characters were copied, i.e. if there was at least one ASCII character in the input, an ASCII hyphen is appended to the output (e.g., "bücher" → "bcher-", but "ü" → "").

Note that hyphens are themselves ASCII characters. Thus, they can be present in the input and, if so, they will be copied to the output. This causes no ambiguity: if the output contains hyphens, the one that got added is always the last one. It marks the end of the ASCII characters.
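The separation step can be sketched in a few lines of Python. The function name `split_basic` is hypothetical and chosen only for this illustration; the sketch covers just the copy-and-delimit step described above, not the encoding of the non-ASCII characters.

```python
def split_basic(label):
    """Copy the ASCII characters of `label`, appending '-' only if any were copied."""
    # Code points below 128 are ASCII; input hyphens are copied like any other.
    basic = "".join(ch for ch in label if ord(ch) < 128)
    return basic + "-" if basic else basic


print(split_basic("bücher"))        # ASCII part followed by the delimiter
print(repr(split_basic("ü")))       # no ASCII characters, so no delimiter
print(split_basic("München-Ost"))   # the appended hyphen is always the last one
```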

Encoding the non-ASCII characters


The non-ASCII characters are sorted by Unicode value, lowest first (if a character occurs more than once, the occurrences are sorted by position). Each is then encoded as a single number, which defines both which character to insert and where to insert it. The number is built from three quantities:

  • The index: the position in the result at which to insert the character, starting at 0 (for insertion at the start).[citation needed]
  • The number of insertionPoints (current length of the result plus one).
  • The reducedCodepoint: the Unicode code point to insert minus 128, the first non-ASCII code point.[citation needed]

The encoded number is insertionPoints × reducedCodepoint + index. By dividing by insertionPoints and also getting the remainder, a decoder can determine reducedCodepoint and index.

There are 6 possible insertion points for a character in the string "bcher" (including before the first character and after the last one). ü is Unicode code point 0xFC or 252 (see Latin-1 Supplement), and the reduced code point is 252 − 128, or 124. The ü is inserted at position 1, after the b. Thus the encoder will add the number 6 × 124 + 1 = 745, and the decoder can retrieve these by ⌊745 / 6⌋ = 124 and 745 mod 6 = 1.
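The arithmetic for "bücher" can be reproduced with a small sketch. The helper names `pack` and `unpack` are hypothetical; the offset 128 is the first non-ASCII code point, which is what makes the reduced code point of ü come out as 124.

```python
INITIAL_N = 128  # first non-ASCII code point (assumed offset for the reduced code point)


def pack(code_point, index, current_length):
    """Combine a code point and an insertion index into one number."""
    insertion_points = current_length + 1
    return insertion_points * (code_point - INITIAL_N) + index


def unpack(number, current_length):
    """Recover (code_point, index) from the combined number."""
    insertion_points = current_length + 1
    reduced, index = divmod(number, insertion_points)
    return reduced + INITIAL_N, index


number = pack(ord("ü"), 1, len("bcher"))
print(number)                         # 745
print(unpack(number, len("bcher")))   # (252, 1)
```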

These numbers are strictly increasing. For the second and subsequent inserted characters, the difference between the number and the previous one is written.
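The numbers actually written for a whole string can be sketched as follows. This is a minimal illustration modelled on the encoding loop of RFC 3492 as I understand it, with the overflow checks left out; the function name `deltas` is hypothetical.

```python
def deltas(label):
    """Return the numbers written for the non-ASCII characters, in encoding order."""
    n = 128        # smallest code point not yet handled
    delta = 0
    handled = sum(1 for ch in label if ord(ch) < 128)   # ASCII characters are already placed
    out = []
    while handled < len(label):
        m = min(ord(ch) for ch in label if ord(ch) >= n)  # next code point to insert
        delta += (m - n) * (handled + 1)   # skip all positions of the unused code points
        n = m
        for ch in label:
            if ord(ch) < n:
                delta += 1                 # one more insertion point passed
            elif ord(ch) == n:
                out.append(delta)          # write the difference, then start over
                delta = 0
                handled += 1
        delta += 1
        n += 1
    return out


print(deltas("bücher"))    # [745]
print(deltas("büücher"))   # [745, 0]
print(deltas("ýbücher"))   # [745, 5]
```

Only the first number carries the full offset; each later one counts how many (code point, position) states are skipped since the previous insertion, which is why repeated or nearby characters produce small values.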

Variable-length number encoding


The number is encoded using the letters a through z and the digits 0 through 9. It is not base-36 but a more complex scheme, generalized variable-length integers, which allows the numbers to be concatenated with nothing separating them.

This is how "kva" is used to represent the code number 745:

A number system with little-endian ordering is used which allows variable-length codes without separate delimiters: a digit lower than a threshold value marks that it is the most-significant digit, hence the end of the number. The threshold value depends on the position in the number and also on previous insertions, to increase efficiency. Correspondingly the weights of the digits vary.

In this case a number system with 36 symbols is used, with the case-insensitive 'a' through 'z' equal to the decimal numbers 0 through 25, and '0' through '9' equal to the decimal numbers 26 through 35. Thus "kva" corresponds to the decimal number string "10 21 0".

To decode this string of symbols, a sequence of thresholds is needed; in this case it is (1, 1, 26, 26, ...).[3] The weight (or place value) of the least-significant digit is always 1: 'k' (=10) with a weight of 1 equals 10. After this, the weight of each digit depends on the preceding threshold: for any n, the weight of the (n+1)-th digit is w × (36 − t), where w is the weight of the n-th digit and t is the threshold of the n-th digit. So in this case, the second symbol has a place value of 36 minus the previous threshold value of 1, which equals 35. Therefore, the sum of the first two symbols 'k' (=10) and 'v' (=21) is 10 × 1 + 21 × 35. Since the second symbol is not less than its threshold value of 1, there is more to come. The third symbol, 'a' (=0), is less than its threshold of 26, so it ends the number; because its value is 0, its weight (35 × 35 = 1225) does not need to be calculated. Therefore, "kva" represents the decimal number (10 × 1) + (21 × 35) = 745.

Conversely, the number 745 is encoded as 10 + 21 × 35 + 0 × 1225: the second digit has weight 35, and the most-significant digit 0 is needed as the terminator. With 10 → 'k', 21 → 'v' and 0 → 'a', "bücher" becomes "bcher-kva".
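The digit scheme can be sketched as a pair of functions. This is an illustration under stated assumptions, not a reference implementation: the names are hypothetical, and the `bias` parameter (fixed at 72 here, the RFC 3492 initial value) is what produces the threshold sequence (1, 1, 26, 26, ...) used for the first number.

```python
DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789"   # 'a'..'z' = 0..25, '0'..'9' = 26..35
BASE, TMIN, TMAX = 36, 1, 26


def threshold(k, bias):
    """Threshold for the digit at position k (k = 36, 72, 108, ...), clamped to 1..26."""
    return min(max(k - bias, TMIN), TMAX)


def encode_number(q, bias=72):
    """Write q as little-endian digits; the last digit is the one below its threshold."""
    out = []
    k = BASE
    while True:
        t = threshold(k, bias)
        if q < t:
            break
        out.append(DIGITS[t + (q - t) % (BASE - t)])
        q = (q - t) // (BASE - t)
        k += BASE
    out.append(DIGITS[q])
    return "".join(out)


def decode_number(text, bias=72):
    """Read one number from the start of text; return (value, digits consumed)."""
    value, weight, k = 0, 1, BASE
    for pos, ch in enumerate(text):
        digit = DIGITS.index(ch.lower())
        value += digit * weight
        t = threshold(k, bias)
        if digit < t:                  # below the threshold: most-significant digit
            return value, pos + 1
        weight *= BASE - t
        k += BASE
    raise ValueError("number is not terminated")


print(encode_number(745))      # kva
print(decode_number("kva"))    # (745, 3)
```

Because the terminating digit is recognisable on its own, `decode_number` stops after three symbols even when more follow, which is what lets several numbers be concatenated without separators.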

The thresholds themselves are determined for each successive encoded character by an algorithm keeping them between 1 and 26 inclusive.[4] Because the digits are case-insensitive, the case of the letters in the encoded string can be used to provide information about the original case of the string.[5]
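How the thresholds shift after each encoded character can be sketched with the bias adaptation function of RFC 3492. The constants `SKEW = 38` and `DAMP = 700` are the RFC's parameter values as I recall them and do not appear in the text above; the helper `thresholds` is hypothetical and only lists the first few thresholds for a given bias.

```python
BASE, TMIN, TMAX, SKEW, DAMP = 36, 1, 26, 38, 700


def adapt(delta, num_points, first_time):
    """Compute the bias for the next number from the one just written."""
    delta = delta // DAMP if first_time else delta // 2   # the first number is damped heavily
    delta += delta // num_points    # num_points: characters in the string after this insertion
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + ((BASE - TMIN + 1) * delta) // (delta + SKEW)


def thresholds(bias, count=4):
    """First `count` thresholds for a given bias, each clamped to 1..26."""
    return [min(max(k - bias, TMIN), TMAX) for k in range(BASE, BASE * (count + 1), BASE)]


print(thresholds(72))            # [1, 1, 26, 26], the sequence used for the first number
bias = adapt(745, 6, True)       # after inserting ü into "bcher"
print(bias, thresholds(bias))    # 0 [26, 26, 26, 26]
```

With a bias of 0 every threshold is 26, so any following number below 26 fits in a single letter.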

Because the encoding algorithm sorts the non-ASCII characters by code point, the codes for inserting a second non-ASCII character into "bücher" follow a fixed order: the first possibility is "büücher" with code "bcher-kvaa", the second is "bücüher" with code "bcher-kvab", and so on. After "bücherü" with code "bcher-kvae" come the codes representing insertion of ý, the Unicode character following ü, starting with "ýbücher" with code "bcher-kvaf" (different from "übücher", coded "bcher-jvab"), etc.

ACE prefix for internationalized domain names


To prevent hyphens in non-international domain names from triggering a Punycode decoding, the string xn-- is prepended to Punycode sequences in internationalized domain names. This is called ACE (ASCII Compatible Encoding).[6]

Thus the domain name "bücher.tld" would be represented in a URL as "xn--bcher-kva.tld".
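The prefixing can be sketched with Python's built-in "punycode" codec. The helpers `to_ace` and `from_ace` are hypothetical names, and the sketch deliberately skips the nameprep normalization and length checks that real IDNA processing applies before encoding.

```python
ACE_PREFIX = "xn--"


def to_ace(label):
    """Encode one label, adding the ACE prefix only when non-ASCII is present."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace(label):
    """Decode one label if it carries the ACE prefix, otherwise return it unchanged."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


domain = ".".join(to_ace(part) for part in "bücher.tld".split("."))
print(domain)                                             # xn--bcher-kva.tld
print(".".join(from_ace(part) for part in domain.split(".")))
```

Python also ships an "idna" codec that combines normalization, Punycode and the prefix in one step; for this name, `"bücher.tld".encode("idna")` yields the same ASCII form.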

Examples


The following table shows examples of Punycode encodings for different types of input.[7]

| Input | Punycode | Description |
|---|---|---|
| (empty) | (empty) | The empty string. |
| a | a- | Only ASCII characters, one, lowercase. |
| A | A- | Only ASCII characters, one, uppercase. |
| 3 | 3- | Only ASCII characters, one, a digit. |
| - | -- | Only ASCII characters, one, a hyphen. |
| -- | --- | Only ASCII characters, two hyphens. |
| London | London- | Only ASCII characters, more than one, no hyphens. |
| Lloyd-Atkinson | Lloyd-Atkinson- | Only ASCII characters, one hyphen. |
| This has spaces | This has spaces- | Only ASCII characters, with spaces. |
| -> $1.00 <- | -> $1.00 <-- | Only ASCII characters, mixed symbols. |
| Б | d0a | No ASCII characters, one Cyrillic character. |
| ü | tda | No ASCII characters, one Latin-1 Supplement character. |
| α | mxa | No ASCII characters, one Greek character. |
| 例 | fsq | No ASCII characters, one CJK character. |
| 😉 | n28h | No ASCII characters, one emoji character. |
| αβγ | mxacd | No ASCII characters, more than one character. |
| München | Mnchen-3ya | Mixed string, with one character that is not an ASCII character. |
| Mnchen-3ya | Mnchen-3ya- | Double-encoded Punycode of "München". |
| München-Ost | Mnchen-Ost-9db | Mixed string, with one character that is not ASCII, and a hyphen. |
| Bahnhof München-Ost | Bahnhof Mnchen-Ost-u6b | Mixed string, with one space, one hyphen, and one character that is not ASCII. |
| abæcdöef | abcdef-qua4k | Mixed string, two non-ASCII characters. |
| правда | 80aafi6cg | Russian, without ASCII. |
| ยจฆฟคฏข | 22cdfh1b8fsa | Thai, without ASCII. |
| 도메인 | hq1bm8jm9l | Korean, without ASCII. |
| ドメイン名例 | eckwd4c7cu47r2wf | Japanese, without ASCII. |
| MajiでKoiする5秒前 | MajiKoi5-783gue6qz075azm5e | Japanese with ASCII. |
| 「bücher」 | bcher-kva8445foa | Mixed non-ASCII scripts (Latin-1 Supplement and CJK). |
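A few of the rows above can be spot-checked with Python's built-in "punycode" codec, the tool footnote [7] names as the source of the table. The selection of rows and the variable names are only for illustration.

```python
rows = {
    "a": "a-",
    "London": "London-",
    "ü": "tda",
    "αβγ": "mxacd",
    "München": "Mnchen-3ya",
    "Mnchen-3ya": "Mnchen-3ya-",     # encoding an already encoded string
    "München-Ost": "Mnchen-Ost-9db",
    "правда": "80aafi6cg",
}

mismatches = []
for text, expected in rows.items():
    encoded = text.encode("punycode").decode("ascii")
    decoded = encoded.encode("ascii").decode("punycode")
    if encoded != expected or decoded != text:
        mismatches.append((text, encoded, decoded))

print(mismatches)   # an empty list means every row round-trips as listed
```

Note that plain Punycode appends a hyphen to ASCII-only input, so encoding is not idempotent: applying it twice to "München" gives "Mnchen-3ya-".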


References

  1. ^ RFC 3492, Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDN), A. Costello, The Internet Society (March 2003)
  2. ^ Maximillian Laumeister (18 July 2020). "Why Is It Called "Punycode"?". Retrieved 13 January 2025.
  3. ^ This is true for the first encoded character (or, in terms of RFC 3492, the first "delta"): see RFC 3492, Sec. 6.
  4. ^ RFC 3492, Secs. 3.4, 5.
  5. ^ RFC 3492, App. A.
  6. ^ Internet Assigned Numbers Authority (2025-08-07). "Completion of IANA Selection of IDNA Prefix". www.atm.tut.fi. Archived from the original on 2025-08-07. Retrieved 2025-08-07.
  7. ^ The Punycode in this table was created using the builtin codec "punycode" of the Python programming language version 3.8 (s.encode("punycode")). See talk page.