WordPress with Chinese language inserts garbled characters into .htaccess file causing it to break

This was originally posted as a WordPress bug report and comment, but after various attempts to resolve the problem, I think it could be an environment issue. It happens in WordPress 6.3.2 (as in the bug report) and 6.4.2 (the version I have tested). Here is the link to the original bug report & comment.

Basically, the problem is: I have a WordPress site with the language set to zh_TW (Chinese, Taiwan). Most of the time it works fine, but occasionally I get a server error 500. It seems that WordPress is adding Chinese UTF-8 comments to the .htaccess, but the comment block is not correctly encoded, resulting in garbled characters.

Sample of the broken .htaccess:

    # BEGIN Wordpress
    # 在含有 BEGIN Wordpress 及 END Wordpress 標記的這� �行間的指示詞� �容為動� �產生,
    # 且應� 有 WordPress 篩選器能進行修改。對這� �行間任何指示詞� �容的變更,
    # 都會遭到系統覆寫。
    <IfModule mod_expires.c>

The expected .htaccess should be:

    # BEGIN Wordpress
    # 在含有 BEGIN Wordpress 及 END Wordpress 標記的這兩行間的指示詞內容為動態產生,
    # 且應僅有 WordPress 篩選器能進行修改。對這兩行間任何指示詞內容的變更,
    # 都會遭到系統覆寫。
    <IfModule mod_expires.c>

This comment block is added by wp-admin/includes/misc.php.

After a lot of investigation, I have found that only some characters cause the problem. I tried manually inserting Chinese characters into the misc.php code. I started from this code:

    $instructions = sprintf(
        __( 'The directives (lines) between "BEGIN %1$s" and "END %1$s" are dynamically generated, and should only be modified via WordPress filters. Any changes to the directives between these markers will be overwritten.' ),
        $marker
    );

Changing it to simply this breaks the .htaccess:

    $instructions = '# 兩';

But changing it to this does NOT:

    $instructions = '# 我';

I have checked my php.ini: default_charset = "UTF-8", and both input_encoding and output_encoding are not set.

Another test case I tried was to manually insert # 兩 somewhere in the .htaccess. This also causes the problem after insert_with_markers is executed.
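One way to compare the failing and working characters from the tests above is to look at their raw UTF-8 bytes. The sketch below is only a diagnostic: the helper names utf8_bytes_hex and has_byte are hypothetical (not part of WordPress), and the idea that a single shared byte matters is an assumption drawn from this small sample, not a confirmed cause. The file must itself be saved as UTF-8.

```php
<?php
// Diagnostic sketch: compare the UTF-8 bytes of characters that were
// garbled in the .htaccess sample with one that survived.
// Both helper functions are hypothetical names used for illustration.

// Return the bytes of $s as space-separated hex pairs, e.g. "e5 85 a9".
function utf8_bytes_hex(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// Report whether the raw byte $byte (a one-byte string) occurs in $s.
function has_byte(string $s, string $byte): bool
{
    return strpos($s, $byte) !== false;
}

// 兩, 內, 態 and 僅 came out garbled in the broken sample; 我 did not.
$chars = ['兩', '內', '態', '僅', '我'];

foreach ($chars as $ch) {
    printf(
        "%s  %s  contains 0x85: %s\n",
        $ch,
        utf8_bytes_hex($ch),
        has_byte($ch, "\x85") ? 'yes' : 'no'
    );
}
```

Run as a UTF-8 file, this reports e5 85 a9 for 兩, e5 85 a7 for 內, e6 85 8b for 態, e5 83 85 for 僅, and e6 88 91 for 我. All four garbled characters contain the byte 0x85 and 我 does not, which matches the position of the space in each garbled sequence. 0x85 is the NEL (next line) control character in ISO-8859-1, so something that treats bytes according to a single-byte locale might be replacing it, but that is a guess to be verified rather than a diagnosis.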
After reading through the code, it seems the function reads and rewrites the entire file. This seems to indicate that the problem occurs when the strings are rewritten into .htaccess. Forcing the code to call mb_convert_encoding($line, 'UTF-8') on every single line, as suggested in "How to convert a file to UTF-8 in php?", also doesn't seem to make a difference. I have also tried forcing the code to write the UTF-8 BOM at the beginning of the file, but Apache also fails with HTTP 500 upon reading the BOM.

Lastly, I copied out the function and wrote it as a stand-alone PHP file to execute on the command line in the server environment. This does **NOT** reproduce the problem, and results in a usable .htaccess file. I had to disable switch_to_locale in the stand-alone copy, but disabling it in misc.php as well didn't seem to make any difference.

So what is different about executing this code directly on the server vs. via CGI? phpinfo did not reveal any difference in default_charset and other encoding values. Any thoughts on how I can deal with this or what else to look at?

Here is some setting information:

    Server is Linux 4.19.286
    PHP 7.4.33
    exif.encode_unicode = ISO-8859-15
    default_charset = UTF-8
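Since phpinfo showed the same default_charset in both cases, it may help to diff a wider set of values between the command line and the web server. The sketch below collects a few settings that can influence how PHP handles bytes. The function name encoding_environment and the choice of settings are my own assumptions about what is worth comparing; none of them is confirmed as the cause.

```php
<?php
// Diagnostic sketch: gather settings that may differ between the CLI and
// the CGI/web SAPI. encoding_environment() is a hypothetical helper name.
function encoding_environment(): array
{
    return [
        'sapi'                     => php_sapi_name(),
        'php_version'              => PHP_VERSION,
        'pcre_version'             => PCRE_VERSION,
        // Passing '0' queries the current locale without changing it.
        'lc_ctype'                 => setlocale(LC_CTYPE, '0'),
        'lc_all'                   => setlocale(LC_ALL, '0'),
        'env_LANG'                 => getenv('LANG'),
        'env_LC_ALL'               => getenv('LC_ALL'),
        'default_charset'          => ini_get('default_charset'),
        // Exists on PHP 7.x only; ini_get() returns false if unknown.
        'mbstring.func_overload'   => ini_get('mbstring.func_overload'),
        'mb_internal_encoding'     => function_exists('mb_internal_encoding')
            ? mb_internal_encoding()
            : null,
    ];
}

// Print one "key = value" line per setting so two runs can be diffed.
foreach (encoding_environment() as $key => $value) {
    printf("%s = %s\n", $key, var_export($value, true));
}
```

Running this once with the php command-line binary and once through the web server (for example from a temporary file requested in the browser), then comparing the two outputs, should show whether the locale or the mbstring settings differ between the two environments. Within WordPress, the same values could also be logged right before insert_with_markers writes the file, since the locale may be changed at runtime.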
