-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Expose format (attribute) info for function declarations in Clang Index API #113754
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ation when parsing headers: clang_Cursor_getFormatAttr Get FormatAttr at Function Declaration clang_FormatAttr_getType Get the format-type field for FormatAttr (i.e. printf, scanf) clang_FormatAttr_getFormatIdx Get the arg index of format template C-string clang_FormatAttr_getFirstArg Get the arg index of first arg for formatting params
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-clang Author: AR Visions (ar-visions) ChangesEnable format info for C function declarations in Clang Index API, so we may have this context
The purpose of this is to enable the silver compiler (LLVM IR C API) to know context information on imported C functions, so it may convert arguments with more context, and no manual tables describing this information. This leverages LLVM's FormatAttr data that is already present so Clang Index users may parse C more effectively Full diff: https://github.com/llvm/llvm-project/pull/113754.diff 4 Files Affected:
diff --git a/.gitignore b/.gitignore
index 0e7c6c79001338..61de1404c127d6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -71,3 +71,5 @@ pythonenv*
/clang/utils/analyzer/projects/*/RefScanBuildResults
# automodapi puts generated documentation files here.
/lldb/docs/python_api/
+silver-debug
+silver-build
diff --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h
index 0c5ac80772e2b9..d27c571b776405 100644
--- a/clang/include/clang-c/Index.h
+++ b/clang/include/clang-c/Index.h
@@ -3109,6 +3109,29 @@ CINDEX_LINKAGE int clang_getFieldDeclBitWidth(CXCursor C);
*/
CINDEX_LINKAGE int clang_Cursor_getNumArguments(CXCursor C);
+/**
+ * Retrieve FormatAttr on function declaration
+ */
+CINDEX_LINKAGE CXCursor clang_Cursor_getFormatAttr (CXCursor cur);
+
+/**
+ * Retrieve string name of the type of formatting (i.e. scanf or printf)
+ * used with clang_Cursor_getFormatAttr
+ */
+CINDEX_LINKAGE CXString clang_FormatAttr_getType (CXCursor C);
+
+/**
+ * Retrieve the arg location (0-based) of the template C-string
+ * used with clang_Cursor_getFormatAttr
+ */
+CINDEX_LINKAGE unsigned clang_FormatAttr_getFormatIdx(CXCursor C);
+
+/**
+ * Retrieve the arg location to the first format argument
+ * used with clang_Cursor_getFormatAttr
+ */
+CINDEX_LINKAGE unsigned clang_FormatAttr_getFirstArg (CXCursor C);
+
/**
* Retrieve the argument cursor of a function or method.
*
diff --git a/clang/tools/libclang/CXCursor.cpp b/clang/tools/libclang/CXCursor.cpp
index 562228cc334f3a..1ec0dc05123193 100644
--- a/clang/tools/libclang/CXCursor.cpp
+++ b/clang/tools/libclang/CXCursor.cpp
@@ -1330,6 +1330,32 @@ CXTranslationUnit clang_Cursor_getTranslationUnit(CXCursor cursor) {
return getCursorTU(cursor);
}
+CXCursor clang_Cursor_getFormatAttr(CXCursor cur) {
+ const Decl *decl = cxcursor::getCursorDecl(cur);
+ if (const FunctionDecl *fd = dyn_cast_or_null<FunctionDecl>(decl))
+ if (const FormatAttr *fa = fd->getAttr<FormatAttr>())
+ return cxcursor::MakeCXCursor(fa, decl, cxcursor::getCursorTU(cur)); // cursor with FormatAttr
+ return clang_getNullCursor();
+}
+
+CXString clang_FormatAttr_getType(CXCursor cur) {
+ const FormatAttr *fa = dyn_cast_or_null<FormatAttr>(cxcursor::getCursorAttr(cur));
+ if (!fa) return cxstring::createEmpty();
+ return cxstring::createDup(fa->getType()->getName());
+}
+
+unsigned clang_FormatAttr_getFormatIdx(CXCursor cur) {
+ const FormatAttr *fa = dyn_cast_or_null<FormatAttr>(cxcursor::getCursorAttr(cur));
+ if (!fa) return 0;
+ return fa->getFormatIdx();
+}
+
+unsigned clang_FormatAttr_getFirstArg(CXCursor cur) {
+ const FormatAttr *fa = dyn_cast_or_null<FormatAttr>(cxcursor::getCursorAttr(cur));
+ if (!fa) return 0;
+ return fa->getFirstArg();
+}
+
int clang_Cursor_getNumArguments(CXCursor C) {
if (clang_isDeclaration(C.kind)) {
const Decl *D = cxcursor::getCursorDecl(C);
diff --git a/clang/tools/libclang/libclang.map b/clang/tools/libclang/libclang.map
index 25d8ba57d32514..7737abe3288de9 100644
--- a/clang/tools/libclang/libclang.map
+++ b/clang/tools/libclang/libclang.map
@@ -59,6 +59,10 @@ LLVM_13 {
clang_Cursor_getMangling;
clang_Cursor_getModule;
clang_Cursor_getNumArguments;
+ clang_Cursor_getFormatAttr;
+ clang_FormatAttr_getType;
+ clang_FormatAttr_getFormatIdx;
+ clang_FormatAttr_getFirstArg;
clang_Cursor_getNumTemplateArguments;
clang_Cursor_getObjCDeclQualifiers;
clang_Cursor_getObjCManglings;
|
I should note that this patch is useful for providing security context to parsers that use formatting. Users that rely on Clang to provide info on the C functions REQUIRE this security validation context. A parser that does no processing on this but relies on user hard coding of formatter-enabled functions from C will simply have more boilerplate and more security issues in general, as (formatter) is a generic standard that anyone can implement in any interface. Please tag as security-related if we can. It clearly is for that purpose. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not that familiar with the C++ side of the bindings, so I would wait for approval from others.
clang_Cursor_getFormatAttr; | ||
clang_FormatAttr_getType; | ||
clang_FormatAttr_getFormatIdx; | ||
clang_FormatAttr_getFirstArg; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be inserted under the header for the next version to be released, i.e. LLVM_20
/** | ||
* Retrieve FormatAttr on function declaration | ||
*/ | ||
CINDEX_LINKAGE CXCursor clang_Cursor_getFormatAttr (CXCursor cur); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to align the indentation of the arguments here (same below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry it took us a while to notice this PR.
I think that documentation of new functions should describe what happens on unhappy code paths, too.
I also would like to see unit tests and a release note.
Even more, I'd like @AaronBallman to sign this off when he comes back (mid-January), because this is about the first time we add functions to CIndex API that are specific to a particular attribute. The approach seems fine to me, but CIndex is our stable API, so I prefer us to move slowly here.
@@ -71,3 +71,5 @@ pythonenv* | |||
/clang/utils/analyzer/projects/*/RefScanBuildResults | |||
# automodapi puts generated documentation files here. | |||
/lldb/docs/python_api/ | |||
silver-debug | |||
silver-build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are unrelated changes that should remain local to your development environment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your patience while I considered this. I'm not certain this is quite the correct design because it's very limited to just the format attribute. I would expect we'd rather want clang_Cursor_getNumArguments
to return the number of arguments for a CXCursor that represents an attribute and clang_Cursor_getArgument
to return attribute arguments. Have you considered such an approach?
Exposes 4 new clang CX functions for use when looking at header information on function declarations (context: CXCursor_FunctionDecl)
The development purpose of change was to enable the silver compiler (LLVM IR C API) to know context information on imported C functions, so it may convert arguments with more context, and no manual tables describing this information.
The change simply leverages LLVM's FormatAttr data that is already present so Clang Index users may parse C more effectively