Skip to content

Canonicalize pin paths #4274

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

dthaler
Copy link
Collaborator

@dthaler dthaler commented Mar 10, 2025

Description

  • Pinning related APIs should treat pin paths the same as filesystem paths and canonicalize them accordingly
  • Fix documentation error with ebpf_get_next_pinned_object_path() where the start_path cannot be null.
  • Added a test. Note that in doing so, I found and filed issue pinned_path_count is always 0 #4273 and added a TODO for that issue in the test where it should be tested.

Fixes #4239

Note: on Linux, pin paths are case sensitive, and so any programs ported from Linux may assume case sensitivity.
On Windows, it varies by filesystem, where Windows itself supports both case sensitive filesystems and case insensitive filesystems. The latter is the normal filesystem on Windows, but if we later support a BPF filesystem, we can make it be case sensitive, for ease of porting from Linux. As such, this PR uses case sensitive pin paths. This is not a change for ebpf-for-windows, which already had case-sensitive pin paths.

Testing

This PR contains associated tests.

Documentation

This PR includes a design doc added to the docs directory.

Installation

No impact.

@dthaler dthaler requested a review from lmb March 10, 2025 19:36
@dthaler dthaler changed the title Canonicalize paths Canonicalize pin paths Mar 10, 2025
Alan-Jowett
Alan-Jowett previously approved these changes Mar 11, 2025
@dthaler dthaler force-pushed the canonicalize-paths branch from 888ff28 to 8717508 Compare March 14, 2025 15:55
@dthaler dthaler force-pushed the canonicalize-paths branch 5 times, most recently from 81b2d93 to a20d9c8 Compare March 18, 2025 15:54
docs/PinPaths.md Outdated
* Pins can be enumerated or removed via normal filesystem APIs/utilities
* Pin paths are case-sensitive
* The path component separator is a forward slash ('/')
* Pin paths begin with "/sys/fs/bpf/" and relative paths like "mymap" are relative to that prefix.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily, this is just a convention that systemd and the popular user space libraries follow. It is possible to mount a separate BPF fs at say /var/run/myapp/bpffs and then pin into that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated text

docs/PinPaths.md Outdated
the driver implements it as such.

As for portability, canonicalization could ensure that Linux style paths like
"/sys/fs/bpf/my/pin/path" would be canonicalized to "BPF:\my\pin\path" allowing
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My gut feeling would be to not remove the sys/fs/bpf prefix. If you do want to remove it, what should happen to paths that do not have it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to not remove it.

docs/PinPaths.md Outdated

As for portability, canonicalization could ensure that Linux style paths like
"/sys/fs/bpf/my/pin/path" would be canonicalized to "BPF:\my\pin\path" allowing
portability in that respect. But querying the pin path on an object
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete sentence?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


### UNC (UNC path)

Example: \\BPF\my\pin\path
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also \\?\ which would disable all path parsing and would forward the path to the filesystem verbatim. And \\.\ which points into the object namespace, but I don't know whether we could hook that somehow?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a brief note.


### Virtual (Windows file system path with another virtual drive for BPF)

Example: BPF:\my\pin\path
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being a windows noob, is it possible to have multi letter drives? Never seen one of those.

Also, would this drive show up in the explorer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a drive, but a virtual device (aka DOS device name). There are a bunch of these in windows, like CON:, PRN:, NUL:, COM1:, etc. And drivers can add new ones.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated wording to explain.

@dthaler dthaler requested a review from lmb March 31, 2025 15:24
Signed-off-by: Dave Thaler <dthaler1968@gmail.com>
dthaler added 3 commits March 31, 2025 08:27
Signed-off-by: Dave Thaler <dthaler1968@gmail.com>
Signed-off-by: Dave Thaler <dthaler1968@gmail.com>
Signed-off-by: Dave Thaler <dthaler1968@gmail.com>
@dthaler dthaler force-pushed the canonicalize-paths branch from bb70b64 to b5fc392 Compare March 31, 2025 15:28
Copy link
Collaborator

@lmb lmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still want to run ebpf-go unit tests against this, I'll get back to you!

native_maps[i].pin_path.value =
ebpf_allocate_with_tag(prefix_length + name_length + 1, EBPF_POOL_TAG_NATIVE);
size_t length = strlen(canonical_path);
native_maps[i].pin_path.value = ebpf_allocate_with_tag(length, EBPF_POOL_TAG_NATIVE);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off by one here for NUL?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NUL isn't needed here since the length is stored. This isn't a change from before, which also didn't include space for a NUL. The +1 before was for the "/", as can be seen in old lines 1042-1045.

strcpy_s(output, output_size, "BPF:\\");

// Add the BPF: prefix if not already present.
if (_strnicmp(input, "BPF:\\", PREFIX_SIZE) == 0 || _strnicmp(input, "BPF:/", PREFIX_SIZE) == 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: would this be easier after transforming slashes?

Copy link
Collaborator Author

@dthaler dthaler Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, a mutable buffer is used to transform slashes, and this is the code that copies the string into the mutable buffer.

ebpf_result_t
ebpf_canonicalize_path(_Out_writes_(output_size) char* output, size_t output_size, _In_z_ const char* input)
{
const int PREFIX_SIZE = 5; // Length of "BPF:\\".
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that 4?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's 5. BPF is 3, then a colon, then a backslash.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated code to be easier to follow (hopefully), and define it as 4 instead of 5 based on the new code.

ebpf_canonicalize_path(_Out_writes_(output_size) char* output, size_t output_size, _In_z_ const char* input)
{
const int PREFIX_SIZE = 5; // Length of "BPF:\\".
strcpy_s(output, output_size, "BPF:\\");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check the return code from strcpy_s? This fails if the destination buffer is too small.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

memmove(output + i + 1, next, strlen(next) + 1);
} else if (strncmp(output + i, "\\..\\", 4) == 0) {
int previous_index = i - 1;
for (; previous_index >= 4; previous_index--) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 4 as the boundary value?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"BPF:" is length 4.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to be easier to follow (I hope).

Signed-off-by: Dave Thaler <dthaler1968@gmail.com>
@dthaler dthaler requested review from Alan-Jowett and lmb April 2, 2025 00:06
@lmb
Copy link
Collaborator

lmb commented Apr 2, 2025

Ran the ebpf-go testsuite against this PR, and it turns out that ebpf_get_next_pinned_object_path needs canonicalization as well.

@dthaler
Copy link
Collaborator Author

dthaler commented Apr 2, 2025

Ran the ebpf-go testsuite against this PR, and it turns out that ebpf_get_next_pinned_object_path needs canonicalization as well.

@lmb What is the specific test case(s) you are using with this API?

@lmb
Copy link
Collaborator

lmb commented Apr 14, 2025

Test case: https://github.com/cilium/ebpf/pull/1736/files#diff-5daa97a3a95cd1516d793eeff2ceb72f8d8fe926f38e0ec584e6ae21f7e5e4c8
Function being tested: https://github.com/cilium/ebpf/pull/1736/files#diff-d3c75b3feceb3d403da6f23bcb72e3374767dbbab89f80a014a0b0c5fc0bb417

P.S. Actually I think canonicalization doesn't work / help. The reason is the HasPrefix call which immediately bails out when using a path like "/sys/fs/bpf/foo" as input because the output of the function includes the BPF: drive letter, or has its slashes normalised. We need the HasPrefix check to determine when to end iteration. This could be worked around by moving that check into ebpf_get_next_pinned_object_path, and operate on normalized paths there. Doesn't solve the problem outright of course.

@lmb
Copy link
Collaborator

lmb commented Apr 14, 2025

Assuming the pin table contains:

  • BPF:\foo\bar
  • BPF:\foo\baz
  • BPF:\beep

Here is what the library wants to do:

root cursor next path continue?
/foo /foo BPF:\foo\bar yes
/foo /foo/baz BPF:\bloop no
/ / BPF:\foo\bar yes

At the moment this doesn't work, since the next path is always returned in its canonicalized form, but the root which we are comparing against isn't canonicalized.

Possible work arounds:

  • Require that the library passes a canonicalized path to the function.
  • Pass root to the function and do the continue check in the runtime, on the canonicalized version.

@dthaler
Copy link
Collaborator Author

dthaler commented Apr 19, 2025

Assuming the pin table contains:

  • BPF:\foo\bar
  • BPF:\foo\baz
  • BPF:\beep

Here is what the library wants to do:

root cursor next path continue?
/foo /foo BPF:\foo\bar yes
/foo /foo/baz BPF:\bloop no
/ / BPF:\foo\bar yes
At the moment this doesn't work, since the next path is always returned in its canonicalized form, but the root which we are comparing against isn't canonicalized.

Possible work arounds:

  • Require that the library passes a canonicalized path to the function.
  • Pass root to the function and do the continue check in the runtime, on the canonicalized version.

@lmb ebpfapi has:

/**
     * @brief Retrieve the next pinned path of an eBPF object.
     *
     * @param[in] start_path Path to look for an entry greater than.
     * @param[out] next_path Returns the next path in lexicographical order, if one exists.
     * @param[in] next_path_len Length of the next path buffer.
     * @param[in, out] type On input, the type of object to retrieve or EBPF_OBJECT_UNKNOWN.
     *                      On output, the type of the object.
     *
     * @retval EBPF_SUCCESS The operation was successful.
     * @retval other An error occurred.
     */
    _Must_inspect_result_ ebpf_result_t
    ebpf_get_next_pinned_object_path(
        _In_z_ const char* start_path,
        _Out_writes_z_(next_path_len) char* next_path,
        size_t next_path_len,
        _Inout_ ebpf_object_type_t* type) EBPF_NO_EXCEPT;

In your table, you have "next path" as a column header. Is that what you're expecting to get back? How could you get back "BPF:\bloop" when that isn't in the pin path table?

What does "root" mean? does that mean the library wants to enumerate only the paths under that root?

What does "cursor" mean? does that mean it's the value you want to pass to start_path to look for an entry greater than?

If we export a function to get the canonical path from the non-canonical path, would that solve the problem?

And how is that different than on linux if the root is "/bar/../foo", where that wouldn't be a substring of the next path returned by an enumeration? Don't you already have to canonicalize the path to compare against in your library?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pinning related APIs should treat names as paths and canonicalize them
4 participants