Skip to content

Conversation

@jurahul
Copy link
Contributor

@jurahul jurahul commented Sep 20, 2024

Support auto when declaring class or def fields. Make auto a TableGen keyword and allow using it as a type when declaring class or def fields. When auto is used, the field must have an initializer from which its type can be inferred.

@jurahul jurahul marked this pull request as ready for review September 20, 2024 17:34
@jurahul jurahul requested review from arsenm and topperc September 20, 2024 17:35
@llvmbot
Copy link
Member

llvmbot commented Sep 20, 2024

@llvm/pr-subscribers-tablegen

Author: Rahul Joshi (jurahul)

Changes

Support 'auto' type when declaring class or def fields. Make 'auto' a TableGen keyword and allow using it as a type when declaring class or def fields. When auto is used, the field must have an initializer from which its type can be infered.


Full diff: https://github.com/llvm/llvm-project/pull/109434.diff

5 Files Affected:

  • (modified) llvm/lib/TableGen/TGLexer.cpp (+1)
  • (modified) llvm/lib/TableGen/TGLexer.h (+1)
  • (modified) llvm/lib/TableGen/TGParser.cpp (+47-11)
  • (modified) llvm/lib/TableGen/TGParser.h (+2-1)
  • (added) llvm/test/TableGen/auto.td (+44)
diff --git a/llvm/lib/TableGen/TGLexer.cpp b/llvm/lib/TableGen/TGLexer.cpp
index 62a884e01a5306..9f23f33ca6dbab 100644
--- a/llvm/lib/TableGen/TGLexer.cpp
+++ b/llvm/lib/TableGen/TGLexer.cpp
@@ -383,6 +383,7 @@ tgtok::TokKind TGLexer::LexIdentifier() {
   StringRef Str(IdentStart, CurPtr-IdentStart);
 
   tgtok::TokKind Kind = StringSwitch<tgtok::TokKind>(Str)
+                            .Case("auto", tgtok::Auto)
                             .Case("int", tgtok::Int)
                             .Case("bit", tgtok::Bit)
                             .Case("bits", tgtok::Bits)
diff --git a/llvm/lib/TableGen/TGLexer.h b/llvm/lib/TableGen/TGLexer.h
index 9adc03ccc72b85..f6d8f0c606da75 100644
--- a/llvm/lib/TableGen/TGLexer.h
+++ b/llvm/lib/TableGen/TGLexer.h
@@ -75,6 +75,7 @@ enum TokKind {
 
   // Reserved keywords. ('ElseKW' is named to distinguish it from the
   // existing 'Else' that means the preprocessor #else.)
+  Auto,
   Bit,
   Bits,
   Code,
diff --git a/llvm/lib/TableGen/TGParser.cpp b/llvm/lib/TableGen/TGParser.cpp
index 1a60c2a567a297..262bf2656e86ed 100644
--- a/llvm/lib/TableGen/TGParser.cpp
+++ b/llvm/lib/TableGen/TGParser.cpp
@@ -3240,13 +3240,19 @@ bool TGParser::ParseTemplateArgValueList(
 ///
 ///  Declaration ::= FIELD? Type ID ('=' Value)?
 ///
-Init *TGParser::ParseDeclaration(Record *CurRec,
-                                       bool ParsingTemplateArgs) {
+Init *TGParser::ParseDeclaration(Record *CurRec, bool ParsingTemplateArgs,
+                                 bool AllowAuto = false) {
   // Read the field prefix if present.
   bool HasField = consume(tgtok::Field);
 
-  RecTy *Type = ParseType();
-  if (!Type) return nullptr;
+  RecTy *Type;
+  if (AllowAuto && consume(tgtok::Auto)) {
+    Type = nullptr;
+  } else {
+    Type = ParseType();
+    if (!Type)
+      return nullptr;
+  }
 
   if (Lex.getCode() != tgtok::Id) {
     TokError("Expected identifier in declaration");
@@ -3268,6 +3274,30 @@ Init *TGParser::ParseDeclaration(Record *CurRec,
   Init *DeclName = StringInit::get(Records, Str);
   Lex.Lex();
 
+  bool HasValue = consume(tgtok::equal);
+
+  // When 'auto' is used, parse the value and infer the type based on the
+  // value's type.
+  SMLoc ValLoc;
+  Init *Val = nullptr;
+  if (!Type) {
+    if (!HasValue) {
+      // auto used without a value.
+      TokError("auto requires assigning a value");
+      return nullptr;
+    }
+    ValLoc = Lex.getLoc();
+    Val = ParseValue(CurRec, Type);
+
+    // Infer type from the value.
+    if (TypedInit *TI = dyn_cast_or_null<TypedInit>(Val)) {
+      Type = TI->getType();
+    } else {
+      Error(ValLoc, "unable to infer type");
+      return nullptr;
+    }
+  }
+
   bool BadField;
   if (!ParsingTemplateArgs) { // def, possibly in a multiclass
     BadField = AddValue(CurRec, IdLoc,
@@ -3289,10 +3319,14 @@ Init *TGParser::ParseDeclaration(Record *CurRec,
   if (BadField)
     return nullptr;
 
-  // If a value is present, parse it and set new field's value.
-  if (consume(tgtok::equal)) {
-    SMLoc ValLoc = Lex.getLoc();
-    Init *Val = ParseValue(CurRec, Type);
+  // For the non-auto case, if a value is present, parse it.
+  if (!Val && HasValue) {
+    ValLoc = Lex.getLoc();
+    Val = ParseValue(CurRec, Type);
+  }
+
+  // Set the new field's value.
+  if (HasValue) {
     if (!Val ||
         SetValue(CurRec, ValLoc, DeclName, {}, Val,
                  /*AllowSelfAssignment=*/false, /*OverrideDefLoc=*/false)) {
@@ -3401,7 +3435,7 @@ bool TGParser::ParseTemplateArgList(Record *CurRec) {
   Record *TheRecToAddTo = CurRec ? CurRec : &CurMultiClass->Rec;
 
   // Read the first declaration.
-  Init *TemplArg = ParseDeclaration(CurRec, true/*templateargs*/);
+  Init *TemplArg = ParseDeclaration(CurRec, /*ParsingTemplateArgs=*/true);
   if (!TemplArg)
     return true;
 
@@ -3410,7 +3444,7 @@ bool TGParser::ParseTemplateArgList(Record *CurRec) {
   while (consume(tgtok::comma)) {
     // Read the following declarations.
     SMLoc Loc = Lex.getLoc();
-    TemplArg = ParseDeclaration(CurRec, true/*templateargs*/);
+    TemplArg = ParseDeclaration(CurRec, /*ParsingTemplateArgs=*/true);
     if (!TemplArg)
       return true;
 
@@ -3445,7 +3479,9 @@ bool TGParser::ParseBodyItem(Record *CurRec) {
     return ParseDump(nullptr, CurRec);
 
   if (Lex.getCode() != tgtok::Let) {
-    if (!ParseDeclaration(CurRec, false))
+    // Allow 'auto' when parsing declarations in the body of a def or class.
+    if (!ParseDeclaration(CurRec, /*ParsingTemplateArgs=*/false,
+                          /*AllowAuto=*/true))
       return true;
 
     if (!consume(tgtok::semi))
diff --git a/llvm/lib/TableGen/TGParser.h b/llvm/lib/TableGen/TGParser.h
index b08e250870901a..7101cd43b53d56 100644
--- a/llvm/lib/TableGen/TGParser.h
+++ b/llvm/lib/TableGen/TGParser.h
@@ -280,7 +280,8 @@ class TGParser {
   bool ParseBodyItem(Record *CurRec);
 
   bool ParseTemplateArgList(Record *CurRec);
-  Init *ParseDeclaration(Record *CurRec, bool ParsingTemplateArgs);
+  Init *ParseDeclaration(Record *CurRec, bool ParsingTemplateArgs,
+                         bool AllowAuto);
   VarInit *ParseForeachDeclaration(Init *&ForeachListValue);
 
   SubClassReference ParseSubClassReference(Record *CurRec, bool isDefm);
diff --git a/llvm/test/TableGen/auto.td b/llvm/test/TableGen/auto.td
new file mode 100644
index 00000000000000..b05af8d5c16a7d
--- /dev/null
+++ b/llvm/test/TableGen/auto.td
@@ -0,0 +1,44 @@
+// RUN: llvm-tblgen %s | FileCheck %s --check-prefix=CHECK-PASS
+// RUN: not llvm-tblgen -DBAD0 %s 2>&1 | FileCheck %s --check-prefix=CHECK-BAD0 -DFILE=%s
+// RUN: not llvm-tblgen -DBAD1 %s 2>&1 | FileCheck %s --check-prefix=CHECK-BAD1 -DFILE=%s
+
+// CHECK-PASS: int Arg = A:P;
+class A<int P> {
+  auto Arg = P;
+}
+
+def MyOp;
+
+// CHECK-PASS-LABEL: class B
+// CHECK-PASS:  int A = 10;
+// CHECK-PASS:  string B = "x";
+// CHECK-PASS:  list<int> C = [10, 10];
+// CHECK-PASS{LITERAL}:  list<list<int>> D = [[10, 10], [11, 11]];
+// CHECK-PASS:  list<A> F = [anonymous_0, anonymous_1];
+// CHECK-PASS:  bits<4> G = { 0, 1, 0, 1 };
+// CHECK-PASS:  code H = [{ printf(); }];
+// CHECK-PASS:  bits<1> I = { 0 };
+// CHECK-PASS:  dag K = (MyOp 10, 100);
+// CHECK-PASS:  int L = 10;
+class B {
+  auto A = 10;
+  auto B = "x";
+  auto C = [10,10];
+  auto D = [[10,10],[11,11]];
+#ifdef BAD0
+  // CHECK-BAD0: [[FILE]]:[[@LINE+1]]:12: error: unable to infer type
+  auto E = [];
+#endif  
+  auto F = [A<10>, A<11>];
+  auto G = {0,1,0,1};
+  auto H = [{ printf(); }];
+  // FIXME: This becomes `bits` and not `bit`.
+  auto I = 0b0;
+#ifdef BAD1
+  // CHECK-BAD1: [[FILE]]:[[@LINE+1]]:12: error: unable to infer type
+  auto J = ?;
+#endif  
+  auto K = (MyOp 10, 100);
+  auto L = A<10>.Arg;
+}
+

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of auto in general, but for tablegen I think it will only make it more difficult to read. It's already hard enough to figure out what's going on without looking at the fully expanded form.

@jurahul
Copy link
Contributor Author

jurahul commented Sep 24, 2024

The same could be said about C++ auto. Just like that, it's a tool to be used judiciously. For simple things like list of string or ints, I'd think it's convenient without added ambiguity.

@jurahul
Copy link
Contributor Author

jurahul commented Sep 24, 2024

That being said, there is some minor rework I want do with this MR, so will do that and refresh it.

@jurahul
Copy link
Contributor Author

jurahul commented Sep 24, 2024

Especially for code like:

class Intrinsic<list<LLVMType> ret_types,
                list<LLVMType> param_types = [],
                list<IntrinsicProperty> intr_properties = [],
                string name = "",
                list<SDNodeProperty> sd_properties = [],
                bit disable_default_attributes = true> : SDPatternOperator {
  string LLVMName = name;
  string TargetPrefix = "";   // Set to a prefix for target-specific intrinsics.
  list<LLVMType> RetTypes = ret_types;
  list<LLVMType> ParamTypes = param_types;

where param_type and ret_types have their type declared, so we can avoid repeating it again in RetTypes and ParamTypes.

Support 'auto' type when declaring class or def fields. Make 'auto'
a TableGen keyword and allow using it as a type when declaring class
or def fields. When auto is used, the field must have an initializer
from which its type can be infered.
@jurahul jurahul force-pushed the tg_auto_type_deduction branch from 217e756 to 0cdb0ed Compare September 24, 2024 16:50
@jurahul
Copy link
Contributor Author

jurahul commented Sep 24, 2024

I did the minor tweak. Let me also start a thread on discourse to see if other folks share the same feeling about auto in TD files. and is so we can drop this :(

@jurahul
Copy link
Contributor Author

jurahul commented Sep 24, 2024

I started a thread to discuss this here: https://discourse.llvm.org/t/rfc-auto-support-in-tablegen-files/81408

@fpetrogalli
Copy link
Member

Especially for code like:

class Intrinsic<list<LLVMType> ret_types,
                list<LLVMType> param_types = [],
                list<IntrinsicProperty> intr_properties = [],
                string name = "",
                list<SDNodeProperty> sd_properties = [],
                bit disable_default_attributes = true> : SDPatternOperator {
  string LLVMName = name;
  string TargetPrefix = "";   // Set to a prefix for target-specific intrinsics.
  list<LLVMType> RetTypes = ret_types;
  list<LLVMType> ParamTypes = param_types;

where param_type and ret_types have their type declared, so we can avoid repeating it again in RetTypes and ParamTypes.

When someone adds more fields before string LLVMName, the fields with auto will move further away and it will not as easy as before to know the types from the context.

@jurahul
Copy link
Contributor Author

jurahul commented Oct 1, 2024

Agreed, but that just means that the use of auto here needs to be reevaluated as code changes. Just because a hammer cannot drill holes doesn't mean that a hammer is not useful. IMO auto is a tool that can be convenient to use when it makes sense, similar to C++ auto. But it seems not a lot of folks agree or care about, so this may not be worth pursuing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants