-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Diagnose potential size confusion with VLA params #129772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Diagnose potential size confusion with VLA params #129772
Conversation
With Clang starting to support bounds safety checking annotations, we've begun to dabble in late parsing of function parameters so that an earlier parameter can refer to a later parameter by name. However, one situation where this can cause a problem is if the earlier parameter is actually referencing a variable from an outer scope. To help identify those situations, this introduces a new diagnostic -Wvla-potential-size-confusion (grouped under -Wvla). This diagnoses code like: int n; void func(int array[n], int n); to let the user know that the 'n' in 'array[n]' references the file scope variable and not the parameter.
@llvm/pr-subscribers-clang Author: Aaron Ballman (AaronBallman) ChangesWith Clang starting to support bounds safety checking annotations, we've begun to dabble in late parsing of function parameters so that an earlier parameter can refer to a later parameter by name. However, one situation where this can cause a problem is if the earlier parameter is actually referencing a variable from an outer scope. To help identify those situations, this introduces a new diagnostic -Wvla-potential-size-confusion (grouped under -Wvla). This diagnoses code like: int n; to let the user know that the 'n' in 'array[n]' references the file scope variable and not the parameter. Full diff: https://github.com/llvm/llvm-project/pull/129772.diff 5 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 97b8e32f03b57..5417fd65dd893 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -222,6 +222,11 @@ Improvements to Clang's diagnostics
- Improve the diagnostics for shadows template parameter to report correct location (#GH129060).
+- Added the ``-Wvla-potential-size-confusion`` diagnostic, which is grouped
+ under ``-Wvla`` to diagnose when a variably-modified type in a function
+ parameter list is using a variable from an outer scope as opposed to a
+ variable declared later in the parameter list.
+
Improvements to Clang's time-trace
----------------------------------
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index fac80fb4009aa..1992c6a424e9e 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -936,7 +936,8 @@ def VexingParse : DiagGroup<"vexing-parse">;
def VLAUseStaticAssert : DiagGroup<"vla-extension-static-assert">;
def VLACxxExtension : DiagGroup<"vla-cxx-extension", [VLAUseStaticAssert]>;
def VLAExtension : DiagGroup<"vla-extension", [VLACxxExtension]>;
-def VLA : DiagGroup<"vla", [VLAExtension]>;
+def VLASizeConfusion : DiagGroup<"vla-potential-size-confusion">;
+def VLA : DiagGroup<"vla", [VLAExtension, VLASizeConfusion]>;
def VolatileRegisterVar : DiagGroup<"volatile-register-var">;
def Visibility : DiagGroup<"visibility">;
def ZeroLengthArray : DiagGroup<"zero-length-array">;
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index d89648a8a2e83..616986cdf5215 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -170,6 +170,13 @@ def err_vla_in_coroutine_unsupported : Error<
"variable length arrays in a coroutine are not supported">;
def note_vla_unsupported : Note<
"variable length arrays are not supported for the current target">;
+def warn_vla_size_expr_shadow : Warning<
+ "variable length array size expression refers to declaration from an outer "
+ "scope">, InGroup<VLASizeConfusion>;
+def note_vla_size_expr_shadow_param : Note<
+ "does not refer to this declaration">;
+def note_vla_size_expr_shadow_actual : Note<
+ "refers to this declaration instead">;
// C99 variably modified types
def err_variably_modified_template_arg : Error<
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index fe313c62ff846..ad0f8d9101a30 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -26,6 +26,7 @@
#include "clang/AST/MangleNumberingContext.h"
#include "clang/AST/NonTrivialTypeVisitor.h"
#include "clang/AST/Randstruct.h"
+#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/AST/StmtCXX.h"
#include "clang/AST/Type.h"
#include "clang/Basic/Builtins.h"
@@ -10333,6 +10334,75 @@ Sema::ActOnFunctionDeclarator(Scope *S, Declarator &D, DeclContext *DC,
}
}
+ // Loop over the parameters to see if any of the size expressions contains
+ // a DeclRefExpr which refers to a variable from an outer scope that is
+ // also named later in the parameter list.
+ // e.g., int n; void func(int array[n], int n);
+ SmallVector<const DeclRefExpr *, 2> DRESizeExprs;
+ llvm::for_each(Params, [&](const ParmVarDecl *Param) {
+ // If we have any size expressions we need to check against, check them
+ // now.
+ for (const auto *DRE : DRESizeExprs) {
+ // Check to see if this parameter has the same name as one of the
+ // DeclRefExprs we wanted to test against. If so, then we found a
+ // situation where an earlier parameter refers to the name of a later
+ // parameter, which is (currently) only valid if there's a variable
+ // from an outer scope with the same name.
+ if (const auto *SizeExprND = dyn_cast<NamedDecl>(DRE->getDecl())) {
+ if (SizeExprND->getIdentifier() == Param->getIdentifier()) {
+ // Diagnose the DeclRefExpr from the parameter with the size
+ // expression.
+ Diag(DRE->getLocation(), diag::warn_vla_size_expr_shadow);
+ // Note the parameter that a user could be confused into thinking
+ // they're referring to.
+ Diag(Param->getLocation(), diag::note_vla_size_expr_shadow_param);
+ // Note the DeclRefExpr that's actually being used.
+ Diag(DRE->getDecl()->getLocation(),
+ diag::note_vla_size_expr_shadow_actual);
+
+ }
+ }
+ }
+
+ // To check whether its size expression is a simple DeclRefExpr, we first
+ // have to walk through pointers or references, but array types always
+ // decay to a pointer, so skip if this is a DecayedType.
+ QualType QT = Param->getType();
+ while (!isa<DecayedType>(QT.getTypePtr()) &&
+ (QT->isPointerType() || QT->isReferenceType()))
+ QT = QT->getPointeeType();
+
+ // An array type is always decayed to a pointer, so we need to get the
+ // original type in that case.
+ if (const auto *DT = QT->getAs<DecayedType>())
+ QT = DT->getOriginalType();
+
+ // Now we can see if it's a VLA type with a size expression.
+ // FIXME: it would be nice to handle constant-sized arrays as well,
+ // e.g., constexpr int n = 12; void foo(int array[n], int n);
+ // however, the constant expression is replaced by its value at the time
+ // we form the type, so we've lost that information here.
+ if (!QT->hasSizedVLAType())
+ return;
+
+ const VariableArrayType *VAT = getASTContext().getAsVariableArrayType(QT);
+ if (!VAT)
+ return;
+
+ class DeclRefFinder : public RecursiveASTVisitor<DeclRefFinder> {
+ SmallVectorImpl<const DeclRefExpr *> &Found;
+
+ public:
+ DeclRefFinder(SmallVectorImpl<const DeclRefExpr *> &Found)
+ : Found(Found) {}
+ bool VisitDeclRefExpr(const DeclRefExpr *DRE) {
+ Found.push_back(DRE);
+ return true;
+ }
+ } Finder(DRESizeExprs);
+ Finder.TraverseStmt(VAT->getSizeExpr());
+ });
+
if (!getLangOpts().CPlusPlus) {
// In C, find all the tag declarations from the prototype and move them
// into the function DeclContext. Remove them from the surrounding tag
diff --git a/clang/test/Sema/vla-potential-size-confusion.c b/clang/test/Sema/vla-potential-size-confusion.c
new file mode 100644
index 0000000000000..de9560483e5f3
--- /dev/null
+++ b/clang/test/Sema/vla-potential-size-confusion.c
@@ -0,0 +1,68 @@
+// RUN: %clang_cc1 %s -std=c23 -verify=expected,c -fsyntax-only
+// RUN: %clang_cc1 %s -std=c23 -verify=good -fsyntax-only -Wno-vla
+// RUN: %clang_cc1 -x c++ %s -verify -fsyntax-only
+// RUN: %clang_cc1 -DCARET -fsyntax-only -std=c23 -fno-diagnostics-show-line-numbers -fcaret-diagnostics-max-lines=1 %s 2>&1 | FileCheck %s -strict-whitespace
+
+// good-no-diagnostics
+
+int n, m; // #decl
+int size(int);
+
+void foo(int vla[n], int n); // expected-warning {{variable length array size expression refers to declaration from an outer scope}} \
+ expected-note {{does not refer to this declaration}} \
+ expected-note@#decl {{refers to this declaration instead}}
+
+void bar(int (*vla)[n], int n); // expected-warning {{variable length array size expression refers to declaration from an outer scope}} \
+ expected-note {{does not refer to this declaration}} \
+ expected-note@#decl {{refers to this declaration instead}}
+
+void baz(int n, int vla[n]); // no diagnostic expected
+
+void quux(int vla[n + 12], int n); // expected-warning {{variable length array size expression refers to declaration from an outer scope}} \
+ expected-note {{does not refer to this declaration}} \
+ expected-note@#decl {{refers to this declaration instead}}
+
+void quibble(int vla[size(n)], int n); // expected-warning {{variable length array size expression refers to declaration from an outer scope}} \
+ expected-note {{does not refer to this declaration}} \
+ expected-note@#decl {{refers to this declaration instead}}
+
+void quobble(int vla[n + m], int n, int m); // expected-warning 2 {{variable length array size expression refers to declaration from an outer scope}} \
+ expected-note 2 {{does not refer to this declaration}} \
+ expected-note@#decl 2 {{refers to this declaration instead}}
+
+// For const int, we still treat the function as having a variably-modified
+// type, but only in C.
+const int x = 12; // #other-decl
+void quorble(int vla[x], int x); // c-warning {{variable length array size expression refers to declaration from an outer scope}} \
+ c-note {{does not refer to this declaration}} \
+ c-note@#other-decl {{refers to this declaration instead}}
+
+// For constexpr int, the function has a constant array type. It would be nice
+// to diagnose this case as well, but the type system replaces the expression
+// with the constant value, and so the information about the name of the
+// variable used in the size expression is lost.
+constexpr int y = 12;
+void quuble(int vla[y], int y); // no diagnostic expected
+
+#ifdef __cplusplus
+struct S {
+ static int v; // #mem-var
+
+ void member_function(int vla[v], int v); // expected-warning {{variable length array size expression refers to declaration from an outer scope}} \
+ expected-note {{does not refer to this declaration}} \
+ expected-note@#mem-var {{refers to this declaration instead}}
+};
+#endif
+
+#ifdef CARET
+// Test that caret locations make sense.
+int w;
+void quable(int vla[w], int w);
+
+// CHECK: void quable(int vla[w], int w);
+// CHECK: ^
+// CHECK: void quable(int vla[w], int w);
+// CHECK: ^
+// CHECK: int w;
+// CHECK: ^
+#endif
|
CC @uecker who may also have opinions on the value of this diagnostic. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Future improvements that would be nice to have are:
|
This warning may be useful for people who are confused about C's scoping and name lookup rules, although I would argue that you create a lot of this confusing by insisting to add the late parsing semantics that are incoherent with how C works. In C++ (where the rules are already a mess because you have different lookup rules in member functions), GCC has an error for the case that is UB. You may want to fix this first. Another problem to think about is that people will want to pass the size via a macro argument for the structure case. In this case you should certainly suppress the warning. |
Also: Are you planning to add warning also for |
expected-note@#decl 2 {{refers to this declaration instead}} | ||
|
||
// For const int, we still treat the function as having a variably-modified | ||
// type, but only in C. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it is implied but maybe it is worth saying in C++ this is usable in a constant expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that there is proposal in flight for C that might change this.
|
||
#ifdef __cplusplus | ||
struct S { | ||
static int v; // #mem-var |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also have the constant expression case for this example like we do for the non-member function case above? More tests the merrier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this diagnostics! I just filed an issue to extend the diagnostics for cases you mentioned and also for bounds annotations: #129783
I may try to back port in some downstream version and extend for bounds annotations to see how many locations it fire.
And a final comment. While the main motivating use case for you may be VLAs in parameter declarations (although as heavy user of those, I never felt the need for such a warning), it would seem the warning would be relevant in cases where the arrays are not actually VLAs. In general, it seems a warning about shadowing a visible identifier somewhere and just having it used with the old definition. So I wonder whether it should be renamed? |
if (!VAT) | ||
return; | ||
|
||
class DeclRefFinder : public RecursiveASTVisitor<DeclRefFinder> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to use the dynamic ast visitor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this should probably just inherit from ConstDynamicRecursiveASTVisitor
instead.
(That also reminds me that I need to get back to that because we still have some unmigrated visitors left...)
@@ -10333,6 +10334,74 @@ Sema::ActOnFunctionDeclarator(Scope *S, Declarator &D, DeclContext *DC, | |||
} | |||
} | |||
|
|||
// Loop over the parameters to see if any of the size expressions contains |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we
- Put all of that in a separate function
- Maybe not run it in modes VLAs are not supported
if (const auto *SizeExprND = dyn_cast<NamedDecl>(DRE->getDecl())) { | ||
if (SizeExprND->getIdentifier() == Param->getIdentifier()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (const auto *SizeExprND = dyn_cast<NamedDecl>(DRE->getDecl())) { | |
if (SizeExprND->getIdentifier() == Param->getIdentifier()) { | |
if (const auto *SizeExprND = dyn_cast<NamedDecl>(DRE->getDecl()); SizeExprND && SizeExprND->getIdentifier() == Param->getIdentifier()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In C++, we probably don't want to warn on int foo[::a], int a
, so we should look at nested name specifiers
QualType QT = Param->getType(); | ||
while (!isa<DecayedType>(QT.getTypePtr()) && | ||
(QT->isPointerType() || QT->isReferenceType())) | ||
QT = QT->getPointeeType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't already a way to do that?
With Clang starting to support bounds safety checking annotations, we've begun to dabble in late parsing of function parameters so that an earlier parameter can refer to a later parameter by name. However, one situation where this can cause a problem is if the earlier parameter is actually referencing a variable from an outer scope.
To help identify those situations, this introduces a new diagnostic -Wvla-potential-size-confusion (grouped under -Wvla). This diagnoses code like:
int n;
void func(int array[n], int n);
to let the user know that the 'n' in 'array[n]' references the file scope variable and not the parameter.