(The codecov/patch failure seems to be bogus; the actual coverage check didn't decline at all.)
Pull request overview
This PR experiments with improving ruby-macho’s internal performance by memoizing frequently accessed derived values and adding a load-command type index, while aiming to keep the public API unchanged and maintain test compatibility.
Changes:
- Added memoization for several derived query methods in `MachOFile` (e.g., rpaths, linked dylibs, segments, segment alignment).
- Built an indexed `@load_commands_by_type` structure during load command parsing for faster `command(...)` lookups.
- Simplified aggregation in `FatFile` using `flat_map`, and updated local tooling metadata (mise + Bundler lockfile version).
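As a rough illustration of the second and third items above, here is a minimal sketch of a type index and of `flat_map` aggregation. It is not the PR's code: `LoadCommand` and `IndexedCommands` are hypothetical stand-ins, and only the names `@load_commands_by_type` and `command` are borrowed from the summary.

```ruby
# Hypothetical stand-in for a parsed load command; not ruby-macho's real class.
LoadCommand = Struct.new(:type, :payload)

# Hypothetical container that indexes commands by type once, at parse time.
class IndexedCommands
  def initialize(load_commands)
    # type symbol => array of commands of that type, in original order
    @load_commands_by_type = load_commands.group_by(&:type)
  end

  # Hash lookup instead of scanning every load command.
  # Returns a copy so callers cannot mutate the index itself.
  def command(type)
    @load_commands_by_type.fetch(type.to_sym, []).dup
  end
end

commands = [
  LoadCommand.new(:LC_SEGMENT_64, "__TEXT"),
  LoadCommand.new(:LC_RPATH, "@loader_path/../lib"),
  LoadCommand.new(:LC_SEGMENT_64, "__DATA"),
]

index = IndexedCommands.new(commands)
segment_names = index.command(:LC_SEGMENT_64).map(&:payload)

# flat_map collects the per-slice arrays in one pass, replacing map + flatten.
slices = [IndexedCommands.new(commands), IndexedCommands.new(commands.take(2))]
all_rpaths = slices.flat_map { |slice| slice.command(:LC_RPATH).map(&:payload) }
```

The index trades a small amount of memory for constant-time lookups by type, which pays off when the same file is queried repeatedly.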
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `mise.toml` | Pins a Ruby toolchain version for mise-based development. |
| `lib/macho/macho_file.rb` | Adds memoization and a type index for load commands to reduce repeated scans. |
| `lib/macho/fat_file.rb` | Uses `flat_map` for cleaner/faster list aggregation across slices. |
| `Gemfile.lock` | Updates the Bundler version recorded in the lockfile. |
Comments suppressed due to low confidence (1)
lib/macho/macho_file.rb:268
`segments` is now memoized and returns a cached array. Since this method is public, returning the same mutable array instance can be an observable API change if callers mutate it (previously it returned a fresh array each time). Consider returning a defensive copy of the cached value to keep memoization without changing external behavior.
# All segment load commands in the Mach-O.
# @return [Array<LoadCommands::SegmentCommand>] if the Mach-O is 32-bit
# @return [Array<LoadCommands::SegmentCommand64>] if the Mach-O is 64-bit
def segments
  @segments ||= if magic32?
                  command(:LC_SEGMENT)
                else
                  command(:LC_SEGMENT_64)
                end
end
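To make the concern concrete, the following sketch contrasts a memoized reader that hands out the cached array with one that hands out a shallow copy. `SegmentCache` and its method names are hypothetical and exist only for illustration; the real `segments` method is the one quoted above.

```ruby
# Hypothetical class, used only to demonstrate the aliasing hazard.
class SegmentCache
  def initialize(names)
    @names = names
  end

  # Memoized, but returns the cached array itself.
  def segments_shared
    @segments_shared ||= compute
  end

  # Memoized, and returns a shallow copy on every call.
  def segments_copied
    (@segments_copied ||= compute).dup
  end

  private

  # Stand-in for the real lookup; builds a fresh array each time.
  def compute
    @names.map(&:dup)
  end
end

cache = SegmentCache.new(%w[__TEXT __DATA])

# A caller mutates whatever it was handed.
cache.segments_shared.clear
cache.segments_copied.clear

shared_after = cache.segments_shared # the cache itself was emptied
copied_after = cache.segments_copied # the cache is intact; only the copy was cleared
```

Note that `dup` is shallow: the array is copied but its elements are still shared, so this protects against `clear` or `<<` on the array, not against mutation of the individual command objects.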
Signed-off-by: William Woodruff <william@yossarian.net>
Pull request overview
Copilot reviewed 2 out of 4 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
lib/macho/macho_file.rb:269
- The `segments` method stores a memoized value but doesn't return a duplicate like the other memoized methods (`dylib_load_commands`, `linked_dylibs`, `rpaths`). Although `command` already returns a duplicate, that duplicate gets stored in `@segments`, and then this method returns `@segments` directly without an additional `.dup`. This allows callers to mutate the cached value. For consistency and safety, add `.dup` before returning.
def segments
  @segments ||= if magic32?
                  command(:LC_SEGMENT)
                else
                  command(:LC_SEGMENT_64)
                end
end
  # @return [Array<LoadCommands::SegmentCommand64>] if the Mach-O is 64-bit
  def segments
-   if magic32?
+   @segments ||= if magic32?
Need to `dup` this, I think.
Alternatively, there's not much to gain from memoization here, given that `command(...)` is memoized.
This is an experiment in using Claude to squeeze more performance out of ruby-macho's internals. The first commits have the raw outcomes (including a mountain of Markdown from Claude); the last commit is me cleaning things up.
The two key constraints here were (1) no public API changes (it kind of failed at this, as Copilot notices, but I think these changes are actually fine), and (2) all existing tests must pass. Other than that, I gave Claude free rein to generate its own benchmarks and evaluate them.
The combination of the changes here makes us 50-70% faster on common (read-heavy) workloads. The actual impact on Homebrew will probably be slightly smaller, since Homebrew doesn't re-read the computed properties much. Still, not bad for ~80 lines of diff 🙂
Signed-off-by: William Woodruff <william@yossarian.net>
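For readers who want to see why read-heavy workloads benefit from memoization, here is a small timing sketch. It is not the benchmark used for this PR and does not reproduce the numbers above: `ScanningReader`, `MemoizedReader`, `time_reads`, and the synthetic command list are all assumptions made for illustration.

```ruby
# Hypothetical reader that rescans the command list on every call.
class ScanningReader
  def initialize(commands)
    @commands = commands
  end

  def rpaths
    @commands.select { |type, _value| type == :LC_RPATH }.map(&:last)
  end
end

# Hypothetical reader that scans once, then returns copies of the cached result.
class MemoizedReader
  def initialize(commands)
    @commands = commands
  end

  def rpaths
    @rpaths ||= @commands.select { |type, _value| type == :LC_RPATH }.map(&:last)
    @rpaths.dup
  end
end

# Wall-clock time for repeated reads, using a monotonic clock.
def time_reads(reader, iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { reader.rpaths }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Synthetic data: 200 fake load commands, every tenth one an rpath.
fake_commands = Array.new(200) { |i| [(i % 10).zero? ? :LC_RPATH : :LC_OTHER, "path#{i}"] }

scanning = ScanningReader.new(fake_commands)
memoized = MemoizedReader.new(fake_commands)

scan_seconds = time_reads(scanning, 10_000)
memo_seconds = time_reads(memoized, 10_000)
puts format("scan: %.4fs, memoized: %.4fs", scan_seconds, memo_seconds)
```

The gap depends heavily on how often a property is re-read, which matches the caveat above that callers that read each property only once will see less benefit.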