ACP: add `str::from_boxed_utf8` for API consistency

# Proposal

## Problem statement

There's an inconsistency in the standard library's methods for converting a byte slice into a string slice:

| Type       | Unchecked Conversion             | Checked Conversion   |
|------------|----------------------------------|----------------------|
| `&str`     | `str::from_utf8_unchecked`       | `str::from_utf8`     |
| `&mut str` | `str::from_utf8_unchecked_mut`   | `str::from_utf8_mut` |
| `Box<str>` | `str::from_boxed_utf8_unchecked` | *none*               |

While workarounds are possible in both safe and unsafe code, the inconsistency can be a source of confusion, and the workarounds are clunkier than they need to be.

The pull request that added `str::from_boxed_utf8_unchecked` originally was going to add `str::from_boxed_utf8` as well, but did not, as it would drop the allocation. With that concern in mind, instead of returning `Utf8Error`, `str::from_boxed_utf8` could return a `FromUtf8Error`, as that preserves the allocation.

## Motivating examples or use cases

I recently was trying to implement a function to load a struct containing a `Box<str>` from a file, and was able to read it as a `Box<[u8]>`. In order to convert the byte slice into a string slice without rolling my own UTF-8 validation, I was trying to find a checked equivalent to `str::from_boxed_utf8_unchecked`, and was surprised that I didn't find one.

My workaround was [a bit clunky](https://github.com/eliminmax/ial/blob/03afc0739ce61ec0d95f79a84097f8d63add3b68/src/debug_info/parse.rs#L359):

```rust
let label_text = if str::from_utf8(&raw_label_text).is_ok() {
    // SAFETY: Already validated
    unsafe { std::str::from_boxed_utf8_unchecked(raw_label_text) }
} else {
    return Err(Error::NonUtf8Label(raw_label_text));
};
```

## Solution sketch

Add `fn from_boxed_utf8() -> Result<Box<str>, {ERROR}>` to `alloc::str`, where `{ERROR}` is either `core::str::Utf8Error`, or `alloc::string::FromUtf8Error`. My test implementation used the former type, but that was before I saw the concern about the dropped allocation.

If `core::str::validations::run_utf8_validation` were re-exported by `core::str` as part of `str_internals`, it could be a two-line function returning a `Utf8Error`, and only slightly more complex if returning a `FromUtf8Error`.

The `Utf8Error` version:

```rust
pub fn from_boxed_utf8(v: Box<[u8]>) -> Result<Box<str>, Utf8Error> {
    core::str::run_utf8_validation(&v)?;
    // SAFETY: validation succeeded.
    Ok(unsafe { from_boxed_utf8_unchecked(v) })
}
```

A `FromUtf8Error` version that avoids discarding the allocation could look like this, with the current `FromUtf8Error` definition:

```rust
pub fn from_boxed_utf8(v: Box<[u8]>) -> Result<Box<str>, FromUtf8Error> {
    if let Err(error) = core::str::run_utf8_validation(&v) {
        return Err(FromUtf8Error { bytes: Vec::from(v), error });
    }
    // SAFETY: validation succeeded.
    Ok(unsafe { from_boxed_utf8_unchecked(v) })
}
```

Given that [#151822](https://github.com/rust-lang/rust/issues/151822) is working on making `FromUtf8Error` support other `bytes` types (see quote in [Alternatives](#alternatives)), it might make sense to wait for that work to complete, in order to avoid the need to convert the `Box<[u8]>` into a `Vec<u8>` for the error if this approach is taken.

## Alternatives

The first alternative I can think of is simply leaving things as they are. There are at least 2 relatively simple ways to convert a `Box<[u8]>` into a `Box<str>` with UTF-8 validation, without API changes, though one requires `unsafe`:

```rust
// returns `FromUtf8Error` for invalid bytes
String::from_utf8(Vec::from(boxed)).map(String::into_boxed_str)
```

```rust
// returns `Utf8Error` for invalid bytes
str::from_utf8(&boxed)?;
Ok(unsafe { str::from_boxed_utf8_unchecked(boxed) })
```

The second alternative was mentioned by @programmerjake [on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/219381-t-libs/topic/Surprising.20absense.20of.20.60str.3A.3Afrom_boxed_utf8.60/near/570917415):

> actually, [#151822](https://github.com/rust-lang/rust/issues/151822) is busy making FromUtf8Error generic over the bytes type, we could easily add:
Box<str>: TryFrom<Box\[u8], Error = FromUtf8Error<Box<\[u8]>>>

Edit: To be clear, I think that the inconsistency is worth addressing for the sake of a more consistent API. Implementing `TryFrom` would certainly make for a more ergonomic safe conversion, and the current approaches I mentioned make the conversion possible already, but it's still an unexpected (minor) source of potential confusion.

## Links and related work

[GitHub issue comment explaining why `str::from_boxed_utf8` wasn't added originally](https://github.com/rust-lang/rust/pull/41258#issuecomment-296212875).

## What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team [feature lifecycle]. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

[feature lifecycle]: https://std-dev-guide.rust-lang.org/development/feature-lifecycle.html

## Possible responses

The libs team may respond in various different ways. First, the team will consider the *problem* (this doesn't require any concrete solution or alternatives to have been proposed):

- We think this problem seems worth solving, and the standard library might be the right place to solve it.
- We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

- We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
- We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.


Type	Unchecked Conversion	Checked Conversion
`&str`	`str::from_utf8_unchecked`	`str::from_utf8`
`&mut str`	`str::from_utf8_unchecked_mut`	`str::from_utf8_mut`
`Box<str>`	`str::from_boxed_utf8_unchecked`	none

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ACP: add `str::from_boxed_utf8` for API consistency #737

Proposal

Problem statement

Motivating examples or use cases

Solution sketch

Alternatives

Links and related work

What happens now?

Possible responses

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

ACP: add str::from_boxed_utf8 for API consistency #737

Description

Proposal

Problem statement

Motivating examples or use cases

Solution sketch

Alternatives

Links and related work

What happens now?

Possible responses

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

ACP: add `str::from_boxed_utf8` for API consistency #737