Support alternative alphabets in BaseXX encodings #145980

# Feature or enhancement

RFC 4648 describes two alphabets for Base64 (standard and urlsafe) and two alphabets for Base32 (standard and hexadecimal). Python also implements three variants of Base85 (Ascii85 is more complex, but it can be built on top of Base85). A number of other formats are based on BaseXX encodings with alternative alphabets.

So I suggest adding an `alphabet` parameter to several `binascii` functions. It can be used in the implementation of the `base64` module, or directly by users implementing alternative formats.
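
As a rough sketch of the behaviour such a parameter would give, an arbitrary 64-character alphabet can be emulated today with an extra translation pass over the standard output. The helper name `b64encode_alphabet` and the constants are hypothetical and not part of `binascii`:

```python
import binascii
import string

# The standard RFC 4648 Base64 alphabet, in index order.
STANDARD = (string.ascii_uppercase + string.ascii_lowercase
            + string.digits + "+/").encode("ascii")
URLSAFE = STANDARD[:62] + b"-_"


def b64encode_alphabet(data, alphabet=STANDARD):
    """Hypothetical helper: emulate an `alphabet` parameter by translation."""
    if len(alphabet) != 64 or len(set(alphabet)) != 64:
        raise ValueError("alphabet must contain 64 distinct bytes")
    encoded = binascii.b2a_base64(data, newline=False)
    if alphabet == STANDARD:
        return encoded
    # Extra pass over the whole output; a native parameter would avoid it.
    return encoded.translate(bytes.maketrans(STANDARD, alphabet))


encoded_urlsafe = b64encode_alphabet(bytes([0xFB, 0xFF, 0xFE]), URLSAFE)
```

The padding character `=` is not in the standard alphabet, so the translation leaves it unchanged.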

We can then remove the recently added functions `b2a_z85()` and `a2b_z85()`, since they are equivalent to `b2a_base85()` and `a2b_base85()` with an alternative alphabet. Also, Base64 with alternative alphabets will be more efficient for large data. Incidentally, this also fixes #145968.
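
The equivalence can be checked with the existing `base64.b85encode()`: for input whose length is a multiple of 4, Z85 output is the Base85 output with each character mapped to the Z85 alphabet. This is a minimal sketch; the helper name `z85_via_b85` is hypothetical, and the two alphabets are written out by hand here rather than taken from the library:

```python
import base64

# Alphabet used by base64.b85encode(), in index order.
B85 = (b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
       b"abcdefghijklmnopqrstuvwxyz!#$%&()*+-;<=>?@^_`{|}~")
# Z85 alphabet (ZeroMQ), in index order.
Z85 = (b"0123456789abcdefghijklmnopqrstuvwxyz"
       b"ABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#")

B85_TO_Z85 = bytes.maketrans(B85, Z85)


def z85_via_b85(data):
    """Hypothetical helper: Z85-encode by re-lettering Base85 output."""
    if len(data) % 4:
        raise ValueError("this sketch only handles lengths divisible by 4")
    return base64.b85encode(data).translate(B85_TO_Z85)


# Test vector from the Z85 specification.
hello = z85_via_b85(bytes([0x86, 0x4F, 0xD2, 0x6F, 0xB5, 0x59, 0xF7, 0x5B]))
```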

For encoding functions, we can simply pass a bytes object containing all alphabet characters. Decoding functions need a reverse table that maps each byte either to its index in the alphabet or to a special invalid value. We can provide a function that creates such a table from the alphabet.
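
A minimal sketch of such a table-building function follows. The name `make_decode_table`, the 256-entry layout (one entry per possible byte value), and the `0xFF` sentinel are assumptions made for illustration, not the proposed API:

```python
INVALID = 0xFF  # assumed sentinel; safe because alphabet indexes stay below 255


def make_decode_table(alphabet):
    """Hypothetical helper: map each byte value to its alphabet index."""
    if len(alphabet) >= INVALID:
        raise ValueError("alphabet is too long for this sentinel")
    if len(set(alphabet)) != len(alphabet):
        raise ValueError("alphabet contains duplicate bytes")
    table = bytearray([INVALID]) * 256   # every byte starts out invalid
    for index, byte in enumerate(alphabet):
        table[byte] = index
    return bytes(table)


BASE32HEX = b"0123456789ABCDEFGHIJKLMNOPQRSTUV"
table = make_decode_table(BASE32HEX)
index_of_v = table[ord("V")]       # last character of the alphabet
index_of_w = table[ord("W")]       # not in the alphabet
```

A decoder can then classify each input byte with a single index lookup instead of searching the alphabet.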

Alternatively, we can create it automatically from the passed alphabet argument and cache the result. This is a less flexible but more user-friendly interface. It adds some overhead for small input data, because the hash of the 64- or 85-byte object has to be calculated, but for large data this is insignificant.
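
The caching variant could look roughly like this in pure Python, using `functools.lru_cache` keyed on the alphabet bytes. The function name, the cache size, and the `0xFF` sentinel are assumptions made for illustration; a C implementation would differ:

```python
from functools import lru_cache

INVALID = 0xFF  # assumed sentinel for bytes outside the alphabet


@lru_cache(maxsize=16)  # cache size chosen arbitrarily for this sketch
def cached_decode_table(alphabet):
    """Hypothetical helper: build the reverse table once per alphabet."""
    table = bytearray([INVALID]) * 256
    for index, byte in enumerate(alphabet):
        table[byte] = index
    return bytes(table)


alphabet = b"0123456789ABCDEFGHIJKLMNOPQRSTUV"
first = cached_decode_table(alphabet)    # builds the table
second = cached_decode_table(alphabet)   # hashes the alphabet, hits the cache
```

The cache lookup requires a hashable key, so in this sketch the alphabet must be `bytes`; a `bytearray` would raise `TypeError`.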


### Linked PRs
* gh-145981
* gh-146230
