Skip to content

Expose format (attribute) info for function declarations in Clang Index API #113754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ar-visions
Copy link

@ar-visions ar-visions commented Oct 26, 2024

Exposes 4 new clang CX functions for use when looking at header information on function declarations (context: CXCursor_FunctionDecl)

clang_Cursor_getFormatAttr
Get FormatAttr at Function Declaration
    If no FormatAttr available, returns null cursor

When cursor is not null (call clang_Cursor_isNull), we may call the following:

clang_FormatAttr_getType
Get the format-type field for FormatAttr (i.e. printf, scanf)

clang_FormatAttr_getFormatIdx
Get the arg index of format template C-string

clang_FormatAttr_getFirstArg
Get the arg index of first arg for formatting params

The development purpose of change was to enable the silver compiler (LLVM IR C API) to know context information on imported C functions, so it may convert arguments with more context, and no manual tables describing this information.

The change simply leverages LLVM's FormatAttr data that is already present so Clang Index users may parse C more effectively

…ation when parsing headers:

    clang_Cursor_getFormatAttr
	Get FormatAttr at Function Declaration

    clang_FormatAttr_getType
	Get the format-type field for FormatAttr (i.e. printf, scanf)

    clang_FormatAttr_getFormatIdx
	Get the arg index of format template C-string

    clang_FormatAttr_getFirstArg
	Get the arg index of first arg for formatting params
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:as-a-library libclang and C++ API labels Oct 26, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 26, 2024

@llvm/pr-subscribers-clang

Author: AR Visions (ar-visions)

Changes

Enable format info for C function declarations in Clang Index API, so we may have this context

clang_Cursor_getFormatAttr
Get FormatAttr at Function Declaration

clang_FormatAttr_getType
Get the format-type field for FormatAttr (i.e. printf, scanf)

clang_FormatAttr_getFormatIdx
Get the arg index of format template C-string

clang_FormatAttr_getFirstArg
Get the arg index of first arg for formatting params

The purpose of this is to enable the silver compiler (LLVM IR C API) to know context information on imported C functions, so it may convert arguments with more context, and no manual tables describing this information. This leverages LLVM's FormatAttr data that is already present so Clang Index users may parse C more effectively


Full diff: https://github.com/llvm/llvm-project/pull/113754.diff

4 Files Affected:

  • (modified) .gitignore (+2)
  • (modified) clang/include/clang-c/Index.h (+23)
  • (modified) clang/tools/libclang/CXCursor.cpp (+26)
  • (modified) clang/tools/libclang/libclang.map (+4)
diff --git a/.gitignore b/.gitignore
index 0e7c6c79001338..61de1404c127d6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -71,3 +71,5 @@ pythonenv*
 /clang/utils/analyzer/projects/*/RefScanBuildResults
 # automodapi puts generated documentation files here.
 /lldb/docs/python_api/
+silver-debug
+silver-build
diff --git a/clang/include/clang-c/Index.h b/clang/include/clang-c/Index.h
index 0c5ac80772e2b9..d27c571b776405 100644
--- a/clang/include/clang-c/Index.h
+++ b/clang/include/clang-c/Index.h
@@ -3109,6 +3109,29 @@ CINDEX_LINKAGE int clang_getFieldDeclBitWidth(CXCursor C);
  */
 CINDEX_LINKAGE int clang_Cursor_getNumArguments(CXCursor C);
 
+/**
+ * Retrieve FormatAttr on function declaration
+ */
+CINDEX_LINKAGE CXCursor clang_Cursor_getFormatAttr   (CXCursor cur);
+
+/**
+ * Retrieve string name of the type of formatting (i.e. scanf or printf)
+ * used with clang_Cursor_getFormatAttr
+ */
+CINDEX_LINKAGE CXString clang_FormatAttr_getType     (CXCursor C);
+
+/**
+ * Retrieve the arg location (0-based) of the template C-string
+ * used with clang_Cursor_getFormatAttr
+ */
+CINDEX_LINKAGE unsigned clang_FormatAttr_getFormatIdx(CXCursor C);
+
+/**
+ * Retrieve the arg location to the first format argument
+ * used with clang_Cursor_getFormatAttr
+ */
+CINDEX_LINKAGE unsigned clang_FormatAttr_getFirstArg (CXCursor C);
+
 /**
  * Retrieve the argument cursor of a function or method.
  *
diff --git a/clang/tools/libclang/CXCursor.cpp b/clang/tools/libclang/CXCursor.cpp
index 562228cc334f3a..1ec0dc05123193 100644
--- a/clang/tools/libclang/CXCursor.cpp
+++ b/clang/tools/libclang/CXCursor.cpp
@@ -1330,6 +1330,32 @@ CXTranslationUnit clang_Cursor_getTranslationUnit(CXCursor cursor) {
   return getCursorTU(cursor);
 }
 
+CXCursor clang_Cursor_getFormatAttr(CXCursor cur) {
+  const Decl *decl = cxcursor::getCursorDecl(cur);
+  if (const FunctionDecl *fd = dyn_cast_or_null<FunctionDecl>(decl))
+    if (const FormatAttr *fa = fd->getAttr<FormatAttr>())
+      return cxcursor::MakeCXCursor(fa, decl, cxcursor::getCursorTU(cur)); // cursor with FormatAttr
+  return clang_getNullCursor();
+}
+
+CXString clang_FormatAttr_getType(CXCursor cur) {
+  const FormatAttr *fa = dyn_cast_or_null<FormatAttr>(cxcursor::getCursorAttr(cur));
+  if (!fa) return cxstring::createEmpty();
+  return cxstring::createDup(fa->getType()->getName());
+}
+
+unsigned clang_FormatAttr_getFormatIdx(CXCursor cur) {
+  const FormatAttr *fa = dyn_cast_or_null<FormatAttr>(cxcursor::getCursorAttr(cur));
+  if (!fa) return 0;
+  return fa->getFormatIdx();
+}
+
+unsigned clang_FormatAttr_getFirstArg(CXCursor cur) {
+  const FormatAttr *fa = dyn_cast_or_null<FormatAttr>(cxcursor::getCursorAttr(cur));
+  if (!fa) return 0;
+  return fa->getFirstArg();
+}
+
 int clang_Cursor_getNumArguments(CXCursor C) {
   if (clang_isDeclaration(C.kind)) {
     const Decl *D = cxcursor::getCursorDecl(C);
diff --git a/clang/tools/libclang/libclang.map b/clang/tools/libclang/libclang.map
index 25d8ba57d32514..7737abe3288de9 100644
--- a/clang/tools/libclang/libclang.map
+++ b/clang/tools/libclang/libclang.map
@@ -59,6 +59,10 @@ LLVM_13 {
     clang_Cursor_getMangling;
     clang_Cursor_getModule;
     clang_Cursor_getNumArguments;
+    clang_Cursor_getFormatAttr;
+    clang_FormatAttr_getType;
+    clang_FormatAttr_getFormatIdx;
+    clang_FormatAttr_getFirstArg;
     clang_Cursor_getNumTemplateArguments;
     clang_Cursor_getObjCDeclQualifiers;
     clang_Cursor_getObjCManglings;

@ar-visions ar-visions changed the title Enable format info for C function declarations in Clang Index API, so we may have this context Expose C format (attribute) info for C function declarations in Clang Index API Oct 26, 2024
@ar-visions ar-visions changed the title Expose C format (attribute) info for C function declarations in Clang Index API Expose C format (attribute) info for function declarations in Clang Index API Oct 26, 2024
@ar-visions ar-visions changed the title Expose C format (attribute) info for function declarations in Clang Index API Expose format (attribute) info for function declarations in Clang Index API Oct 26, 2024
@ar-visions
Copy link
Author

ar-visions commented Nov 8, 2024

I should note that this patch is useful for providing security context to parsers that use formatting. Users that rely on Clang to provide info on the C functions REQUIRE this security validation context. A parser that does no processing on this but relies on user hard coding of formatter-enabled functions from C will simply have more boilerplate and more security issues in general, as (formatter) is a generic standard that anyone can implement in any interface.

Please tag as security-related if we can. It clearly is for that purpose.

Copy link
Contributor

@DeinAlptraum DeinAlptraum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not that familiar with the C++ side of the bindings, so I would wait for approval from others.

Comment on lines +62 to +65
clang_Cursor_getFormatAttr;
clang_FormatAttr_getType;
clang_FormatAttr_getFormatIdx;
clang_FormatAttr_getFirstArg;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be inserted under the header for the next version to be released, i.e. LLVM_20

/**
* Retrieve FormatAttr on function declaration
*/
CINDEX_LINKAGE CXCursor clang_Cursor_getFormatAttr (CXCursor cur);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to align the indentation of the arguments here (same below)

Copy link
Contributor

@Endilll Endilll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry it took us a while to notice this PR.

I think that documentation of new functions should describe what happens on unhappy code paths, too.

I also would like to see unit tests and a release note.

Even more, I'd like @AaronBallman to sign this off when he comes back (mid-January), because this is about the first time we add functions to CIndex API that are specific to a particular attribute. The approach seems fine to me, but CIndex is our stable API, so I prefer us to move slowly here.

@@ -71,3 +71,5 @@ pythonenv*
/clang/utils/analyzer/projects/*/RefScanBuildResults
# automodapi puts generated documentation files here.
/lldb/docs/python_api/
silver-debug
silver-build
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are unrelated changes that should remain local to your development environment

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your patience while I considered this. I'm not certain this is quite the correct design because it's very limited to just the format attribute. I would expect we'd rather want clang_Cursor_getNumArguments to return the number of arguments for a CXCursor that represents an attribute and clang_Cursor_getArgument to return attribute arguments. Have you considered such an approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:as-a-library libclang and C++ API clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants