became_fish ,
@became_fish@jorts.horse avatar

people say "C++ isn't any closer to the metal than Rust is" but what is the memory footprint of returning a Result? in C++ i can pass an out parameter by reference to avoid allocating anything on the stack - how do i do that in rust? is it with a &mut ptr? is it done automatically by the compiler when returning?

andxor ,
@andxor@mstdn.social avatar

@became_fish yes, you use mutable references for that use case. They're equivalent to non-const pointers in C++, with the exception that references in safe Rust are never null or dangling, so using them doesn't cause undefined behavior

diegovsky ,
@diegovsky@fosstodon.org avatar

@became_fish a Result is a very lightweight type tbh. It's as big as its largest variant plus at most a small tag (and padding), so its stack size is usually negligible and you wouldn't need to do that. The compiler is smart enough to optimise that, so a small Result likely gets returned in registers, or the function gets inlined.

If you insist on going that route, though, yes, &mut is the way.
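
A minimal sketch of the two styles discussed above, side by side. The names `ParseError`, `parse_digit`, and `parse_digit_into` are hypothetical and only serve the illustration; whether the small `Result` really ends up in registers depends on the target and optimization level.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub struct ParseError;

/// Returns the value directly. A `Result` this small can typically be
/// returned in registers, but that is up to the compiler and the ABI.
pub fn parse_digit(c: char) -> Result<u32, ParseError> {
    c.to_digit(10).ok_or(ParseError)
}

/// C++-style out parameter: the caller provides the storage for `out`,
/// and this function writes through the mutable reference.
/// On failure `out` is left untouched.
pub fn parse_digit_into(c: char, out: &mut u32) -> Result<(), ParseError> {
    *out = c.to_digit(10).ok_or(ParseError)?;
    Ok(())
}
```

Calling `parse_digit('7')` yields `Ok(7)`, while `parse_digit_into('7', &mut v)` writes 7 into a caller-owned `v`. Note that `v` still lives somewhere in the caller (usually on its stack), so the out-parameter version does not remove the storage, it only moves it.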

kornel ,
@kornel@mastodon.social avatar

@became_fish Yes, the ABI automatically switches to returning by reference when the Result is too large to spread across registers.
There's also niche optimization, e.g. Result<Box<T>, UnitErr> is just a pointer. More efficient than ABIs of unique_ptr/variant/expected.
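
The niche optimization mentioned above can be checked with `std::mem::size_of`. This is a small sketch; `UnitErr` here is a hypothetical zero-sized error type standing in for the one named in the post, and `u64` is an arbitrary payload.

```rust
use std::mem::size_of;

/// Hypothetical zero-sized error type, standing in for `UnitErr` above.
pub struct UnitErr;

/// Size in bytes of a plain owning heap pointer.
pub fn box_size() -> usize {
    size_of::<Box<u64>>()
}

/// Size in bytes of the `Result`. A `Box` is never null, so the null
/// pointer value is free to represent the `Err` case.
pub fn result_size() -> usize {
    size_of::<Result<Box<u64>, UnitErr>>()
}
```

Both functions return the same number (the size of one pointer), so the `Result` wrapper adds no space here.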

became_fish OP ,
@became_fish@jorts.horse avatar

@kornel is there somewhere i can learn what rust is doing under the hood?

kornel ,
@kornel@mastodon.social avatar
ekuber ,
@ekuber@hachyderm.io avatar
became_fish OP ,
@became_fish@jorts.horse avatar

how do i pass around segments of a string without copying? in C++ it's a string_view. in Rust i need to copy the string into a Vec<char>, then use unsafe pointer arithmetic to get a &str from it without copying again?

became_fish OP ,
@became_fish@jorts.horse avatar

like there must be a better way but everyone online just says you need to copy it into a String

image/png

migratory ,
@migratory@jorts.horse avatar

@became_fish you do need to copy into a String in this case because Vec<char> is a very uncommon type to encounter as it's roughly equivalent to a UTF-32 string, while &str is a slice of a UTF-8 string

became_fish OP ,
@became_fish@jorts.horse avatar

@migratory i'm working on a tokenizer so being able to iterate character-by-character is a must, and it seems like Vec<char> is the way to do that. but if i go that route, i have to allocate on the heap to be able to read segments from that vec as a string? is there a way to do this without Vec<char>?

migratory ,
@migratory@jorts.horse avatar

@became_fish it's more idiomatic to accumulate characters into a String than store them in a Vec<char>; iterating over String by character is just the .chars() method

became_fish OP ,
@became_fish@jorts.horse avatar

@migratory but there's a lot of times where i need to look forward in the string without iterating it - like to check if a > symbol is a >= or just a >.

migratory ,
@migratory@jorts.horse avatar

@became_fish I don't know quite how your code looks, but I would expect that to look something like (assuming a string s):

if s[byte_index..].starts_with(">=") { ... }

became_fish OP ,
@became_fish@jorts.horse avatar

@migratory oh you can do that? that's sick. i'll try rewriting it

migratory ,
@migratory@jorts.horse avatar

@became_fish String implements Deref<Target=str> which means it has all the methods str does, including the implementation of the Slice operators, and then str has a bunch of handy query methods like starts_with/find/etc. as well
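
As a sketch of that lookahead in a tokenizer, assuming the tokenizer tracks a byte offset into the source: `read_greater` is a hypothetical helper, and it uses `str::get` rather than `s[byte_index..]` so that a bad offset returns `None` instead of panicking.

```rust
/// Hypothetical tokenizer helper: returns the operator that starts at
/// `byte_index` in `src`, if it is `>` or `>=`.
pub fn read_greater(src: &str, byte_index: usize) -> Option<&'static str> {
    // `get` returns None when the index is out of range or falls in the
    // middle of a multi-byte character, where `src[byte_index..]` would panic.
    let rest = src.get(byte_index..)?;
    if rest.starts_with(">=") {
        Some(">=")
    } else if rest.starts_with('>') {
        Some(">")
    } else {
        None
    }
}
```

Nothing is copied here: `rest` is just a borrowed view of the tail of `src`, and `starts_with` compares bytes in place.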

diegovsky ,
@diegovsky@fosstodon.org avatar

@became_fish @migratory you don't need to convert into a String, you can just use .chars() if you have a &str

diegovsky ,
@diegovsky@fosstodon.org avatar

@became_fish @migratory if you don't want to allocate extra memory but still want to iterate char-by-char, you can also use the .chars() method and consume one element at a time.

The returned value is a rust iterator that yields a char every time you call .next() on it. It doesn't allocate anything on the heap because it's actually just a pointer to the string + a counter
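
A rough sketch of a tokenizer loop built on this idea. It uses `char_indices()`, a sibling of `.chars()` that also yields each character's byte offset, wrapped in `peekable()` for one character of lookahead. The function `tokenize` and its token rules (alphanumeric runs, `>=`, single symbols) are made up for illustration.

```rust
/// Splits `src` into tokens that borrow from `src`. No token text is
/// copied; only the `Vec` that holds the slices is allocated.
pub fn tokenize(src: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    let mut chars = src.char_indices().peekable();
    while let Some((start, c)) = chars.next() {
        if c.is_whitespace() {
            continue;
        }
        let mut end = start + c.len_utf8();
        if c.is_alphanumeric() {
            // Extend the token while the next char is alphanumeric.
            while let Some(&(i, next)) = chars.peek() {
                if !next.is_alphanumeric() {
                    break;
                }
                end = i + next.len_utf8();
                chars.next();
            }
        } else if c == '>' {
            // Look one char ahead; consume it only if it is '='.
            if let Some(&(i, '=')) = chars.peek() {
                end = i + 1;
                chars.next();
            }
        }
        tokens.push(&src[start..end]);
    }
    tokens
}
```

For example, `tokenize("x >= 10")` produces the three slices `x`, `>=`, and `10`, each pointing into the original string.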

migratory ,
@migratory@jorts.horse avatar

@became_fish that said the code you have is going to crash because it promises that the Vec<char> data is UTF-8, which it isn't. you want to do chars.iter().collect::<String>() instead of this unsafe block, and then you'll get a String with no Result wrapper
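
The safe conversion suggested above, as a small sketch. The original code from the screenshot is not reproduced here; `chars_to_string` is a hypothetical wrapper name.

```rust
/// Re-encodes a slice of `char`s (4 bytes each) as a UTF-8 `String`.
/// This allocates a new buffer, but it is safe and cannot fail,
/// so there is no `Result` to unwrap.
pub fn chars_to_string(chars: &[char]) -> String {
    chars.iter().collect()
}
```

Given `let v = vec!['h', 'é', 'y'];`, calling `chars_to_string(&v[1..])` returns the `String` "éy".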

Corax42 ,
@Corax42@mastodon.social avatar

@became_fish Why do you have a Vec<char> to begin with?

Fundamentally, the problem with this setup is that a str is a sequence of UTF-8 bytes, where each char is variable-length encoded and ends up taking between 1 and 4 bytes. But a slice of chars is a fixed-length encoding, where every char takes 4 bytes. Due to that mismatch, it should be impossible to convert a slice of a Vec<char> into a str without allocating a new buffer.
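
The size mismatch can be made concrete with a short sketch. The helper names `utf32_len` and `utf8_len` are hypothetical.

```rust
use std::mem::size_of;

/// Bytes used by text stored as `char`s: always 4 per character.
pub fn utf32_len(chars: &[char]) -> usize {
    chars.len() * size_of::<char>()
}

/// Bytes used by the same text as UTF-8: 1 to 4 per character.
pub fn utf8_len(s: &str) -> usize {
    s.len()
}
```

For the two-character text "a€", the `Vec<char>` form takes 8 bytes while the UTF-8 form takes 4 (1 for `a`, 3 for `€`), so the two buffers cannot be reinterpreted as each other.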

became_fish OP ,
@became_fish@jorts.horse avatar

@Corax42 check the other reply thread

Corax42 ,
@Corax42@mastodon.social avatar

@became_fish Oh, okay

nanobot248 ,
@nanobot248@mtd.sysblog.at avatar

@became_fish in Rust, a Vec<char> is not a string, so you can't create a string_view aka &str from it. you could create a &[char] slice (via as_slice()) from it. char is more or less equivalent to u32, so Vec<char> is like Vec<u32>.

a std::string::String is like a utf-8 encoded Vec<u8>. so from that you can create a &[u8] with as_bytes() or a &str with as_str().

if you have Vec<char> and want &str, a conversion step (e.g. copy to String) is necessary, as the whole encoding/mem-layout is different.

kornel ,
@kornel@mastodon.social avatar

@became_fish &str is conceptually identical to string_view.
Box<str> is the same ptr+len, but heap allocated.
Vec<char> is UTF-32, and generally avoided.
Rust has no copy constructors. You can't copy heap types by mistake. It's like RVO guaranteed everywhere all the time.
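
A small sketch showing that a `&str` taken from a `String` is a view into the same buffer, much like a `string_view`. Both `first_word` and `is_view_into` are hypothetical helpers written for this illustration.

```rust
/// Returns the first whitespace-delimited word of `s` as a borrowed slice.
pub fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

/// True if `part` lies inside the buffer of `whole`, i.e. it is a view
/// and not a copy.
pub fn is_view_into(whole: &str, part: &str) -> bool {
    let whole_range = whole.as_bytes().as_ptr_range();
    let part_range = part.as_bytes().as_ptr_range();
    whole_range.start <= part_range.start && part_range.end <= whole_range.end
}
```

With `let owned = String::from("hello world");`, the slice `first_word(&owned)` passes `is_view_into`, whereas an explicit `.to_string()` copy of it does not. The copy only happens when it is asked for.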

became_fish OP ,
@became_fish@jorts.horse avatar

@kornel but constructing a String from a &str copies the &str, right?

kornel ,
@kornel@mastodon.social avatar

@became_fish Yes, because String is heap allocated by definition. It's {ptr, capacity, len}.

&str is agnostic about the data source, and guarantees to never run any destructor or free() on the pointer.

Cow<str> is an enum, so it effectively holds a flag that tracks whether it owns the data (and must free it) or only borrows it.
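
A minimal sketch of where `Cow<str>` is handy: a function that usually returns its input untouched and only allocates when it has to change something. `expand_tabs` is a hypothetical example function.

```rust
use std::borrow::Cow;

/// Replaces each tab with four spaces, allocating a new `String`
/// only when the input actually contains a tab.
pub fn expand_tabs(s: &str) -> Cow<'_, str> {
    if s.contains('\t') {
        Cow::Owned(s.replace('\t', "    "))
    } else {
        Cow::Borrowed(s)
    }
}
```

For input without tabs the result is `Cow::Borrowed` and nothing is freed when it is dropped; for input with tabs it is `Cow::Owned` and the new buffer is freed on drop.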

Corax42 ,
@Corax42@mastodon.social avatar

@became_fish @kornel Yes, necessarily, because each String is the unique owner of its buffer, and the &str is borrowed from another owner.

andxor ,
@andxor@mstdn.social avatar

@became_fish that would be a slice:

let s = "abc";
foo(&s[1..2]); // read-only slice

let mut s = String::from("abc");
bar(&mut s[1..2]); // read-write slice
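
To make the snippet above compile, `foo` and `bar` need definitions. The bodies below are assumptions chosen only to show one read and one in-place write through a slice.

```rust
/// Reads from a borrowed slice without copying it.
pub fn foo(s: &str) -> usize {
    s.len()
}

/// Modifies the slice in place through a mutable reference.
/// Only same-length edits are possible through `&mut str`.
pub fn bar(s: &mut str) {
    s.make_ascii_uppercase();
}
```

With these definitions, `bar(&mut s[1..2])` on a `String` holding "abc" leaves it as "aBc". The ranges are byte offsets and must fall on character boundaries, otherwise slicing panics.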
