-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Add a more efficient implementation for String(::ReinterpretArray) #58438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
base/strings/string.jl
Outdated
function String(v::ReinterpretArray{UInt8, 1, S, Vector{S}, IsReshaped}) where {S, IsReshaped} | ||
len = length(v) | ||
len == 0 && return "" | ||
ref = v.parent.ref | ||
if ref.ptr_or_offset == ref.mem.ptr | ||
str = ccall(:jl_genericmemory_to_string, Ref{String}, (Any, Int), ref.mem, len) | ||
else | ||
str = ccall(:jl_pchar_to_string, Ref{String}, (Ptr{S}, Int), ref, len) | ||
end | ||
# optimized empty!(v.parent); sizehint!(v.parent, 0) calls | ||
setfield!(v.parent, :size, (0,)) | ||
setfield!(v.parent, :ref, memoryref(Memory{S}())) | ||
return str | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is invalid, because String(::AbstractVector)
is documented to copy, unless the vector is a Vector
. So you'd need to make a stringvector, then copy to it.
function String(v::ReinterpretArray{UInt8, 1, S, Vector{S}, IsReshaped}) where {S, IsReshaped} | |
len = length(v) | |
len == 0 && return "" | |
ref = v.parent.ref | |
if ref.ptr_or_offset == ref.mem.ptr | |
str = ccall(:jl_genericmemory_to_string, Ref{String}, (Any, Int), ref.mem, len) | |
else | |
str = ccall(:jl_pchar_to_string, Ref{String}, (Ptr{S}, Int), ref, len) | |
end | |
# optimized empty!(v.parent); sizehint!(v.parent, 0) calls | |
setfield!(v.parent, :size, (0,)) | |
setfield!(v.parent, :ref, memoryref(Memory{S}())) | |
return str | |
end | |
function String(v::ReinterpretArray{UInt8, 1, S, Vector{S}, IsReshaped}) where {S, IsReshaped} | |
len = length(v) | |
iszero(len) && return "" | |
par = parent(v) | |
mem = StringMemory(len % UInt) | |
GC.@preserve mem par begin | |
unsafe_copyto!(pointer(mem), Ptr{UInt8}(pointer(par)), len % UInt) | |
end | |
return unsafe_takestring(mem) | |
end |
Another issue is whether you can make the signature broader - what about reinterprets of Memory
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented this fix, but using jl_pchar_to_string
(which always copies) to remove the extra allocation from StringMemory
. I also implemented it for reinterprets of Memory
.
This has made me realize that the docstring for String
could be clearer. Perhaps splitting up the docstring for Vector{UInt8}
and AbstractVector{UInt8}
would be good. There's also a few missing entries that might be good to have (like the one for AbstractVector{Char}
. Do you have any thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! Better to do that in a separate PR, in my opinion.
Are the new methods covered by tests already? |
Doesn't seems so - looks like I was too quick with the merge me label. |
Not sure if these tests are sufficient or not. |
I added a new method
String(::ReinterpretArray{UInt8, 1, S, Vector{S}, IsReshaped})
, which is a copy ofString(::Vector{UInt8})
but usingv.parent
instead ofv
. I'm not sure if the same can be done for more genericReinterpretArray
s or for otherAbstractVector
s, I'd like some input on that if possible.Previous behavior:
New behavior: