Error while using add_calculated_column #1180

@stefanodallura

Describe the bug
I get this error when calling tom.add_calculated_column:
AttributeError
'CalculatedColumn' object has no attribute 'SourceColumn'

To Reproduce
I run the code below, which reads a configuration table.
The configuration table stores metadata about data columns (with a source column defined) and calculated columns (with no source column, but with an expression defined).

#########BEGIN NOTEBOOK###########
from pyspark.sql.functions import col
from sempy_labs.tom import connect_semantic_model

def add_columns_from_configuration(tableNameConfig, dataset, workspace):

    # Read the configuration table; sorted by SORT_BY to handle dependencies between columns
    # (semantic_model_name is a notebook-level variable defined in an earlier cell)
    config_df = spark.read.table(tableNameConfig).filter(col("SEMANTICMODEL_NAME")==semantic_model_name).orderBy(col("SORT_BY").asc_nulls_first())

    # Connect to the semantic model
    with connect_semantic_model(dataset=dataset, readonly=False, workspace=workspace) as tom:
        # Iterate through each row in the configuration DataFrame
        for row in config_df.collect():
            table_name = row['SEMANTICMODEL_TABLE']
            column_name = row['SEMANTICMODEL_COLUMN']
            source_column = row['SOURCE_COLUMN']
            data_type = row['DATA_TYPE']
            format_string = row['FORMAT_STRING'] if 'FORMAT_STRING' in row else None
            hidden = bool(row['HIDDEN']) if 'HIDDEN' in row else False
            display_folder = row['DISPLAY_FOLDER'] if 'DISPLAY_FOLDER' in row else None
            summarize_by = row['SUMMARIZE_BY'] if 'SUMMARIZE_BY' in row else None
            formula = row['FORMULA'] if 'FORMULA' in row else None
            sort_by_column = row['SORT_BY'] if 'SORT_BY' in row else None

            if formula:
                # Add a calculated column to the semantic model
                tom.add_calculated_column(
                    table_name=table_name,
                    column_name=column_name,
                    expression=formula,
                    data_type=data_type,
                    format_string=format_string,
                    hidden=hidden,
                    display_folder=display_folder,
                    summarize_by=summarize_by
                )
                print(f"Added calculated column: {column_name} to table: {table_name}")
            else:
                # Add a physical data column to the semantic model
                tom.add_data_column(
                    table_name=table_name,
                    column_name=column_name,
                    source_column=source_column,
                    data_type=data_type,
                    format_string=format_string,
                    hidden=hidden,
                    display_folder=display_folder,
                    summarize_by=summarize_by
                )
                print(f"Added data column: {column_name} to table: {table_name}")

            if sort_by_column:
                # Apply sorting configuration to the column
                tom.set_sort_by_column(
                    table_name=table_name,
                    column_name=column_name,
                    sort_by_column=sort_by_column
                )
                print(f"Added sorting by: {sort_by_column} to column: {column_name}")

# Example usage

config_file_path = tableNameConfigColumns
dataset_name = semantic_model_name
workspace_name = semantic_model_workspace
add_columns_from_configuration(config_file_path, dataset_name, workspace_name)

#########END NOTEBOOK###########
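
For anyone trying to reproduce this without a Fabric workspace, the branching in the notebook can be isolated as a plain function. This is only a sketch: `plan_column` and `OPTIONAL_FIELDS` are hypothetical names, the rows are plain dicts (for example what `row.asDict()` would give for a PySpark `Row`), and the keyword names are simply the ones used in the notebook above.

```python
# Hypothetical helper, not part of sempy_labs: maps one configuration row
# (as a plain dict) to the TOMWrapper method name and keyword arguments
# that the notebook above would use for it.
OPTIONAL_FIELDS = {
    "FORMAT_STRING": "format_string",
    "DISPLAY_FOLDER": "display_folder",
    "SUMMARIZE_BY": "summarize_by",
}


def plan_column(row: dict) -> tuple[str, dict]:
    kwargs = {
        "table_name": row["SEMANTICMODEL_TABLE"],
        "column_name": row["SEMANTICMODEL_COLUMN"],
        "data_type": row["DATA_TYPE"],
        "hidden": bool(row.get("HIDDEN", False)),
    }
    # Optional fields default to None when missing from the row.
    for field, param in OPTIONAL_FIELDS.items():
        kwargs[param] = row.get(field)

    if row.get("FORMULA"):
        # A formula means a calculated column; no source column is passed.
        kwargs["expression"] = row["FORMULA"]
        return "add_calculated_column", kwargs

    # No formula means a physical data column backed by a source column.
    kwargs["source_column"] = row.get("SOURCE_COLUMN")
    return "add_data_column", kwargs


calc_plan = plan_column({
    "SEMANTICMODEL_TABLE": "Sales",
    "SEMANTICMODEL_COLUMN": "Margin",
    "DATA_TYPE": "Double",
    "FORMULA": "[Revenue] - [Cost]",
})
data_plan = plan_column({
    "SEMANTICMODEL_TABLE": "Sales",
    "SEMANTICMODEL_COLUMN": "Revenue",
    "DATA_TYPE": "Double",
    "SOURCE_COLUMN": "revenue",
})
print(calc_plan[0], data_plan[0])
```

Inside the `with connect_semantic_model(...) as tom:` block, the result could then be applied with `method, kwargs = plan_column(row.asDict())` followed by `getattr(tom, method)(**kwargs)`. Only rows that take the `add_calculated_column` branch lead to the error reported here.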

tom.add_data_column works without any issue; the error only occurs when I use tom.add_calculated_column.

This is the traceback from Fabric:

AttributeError
'CalculatedColumn' object has no attribute 'SourceColumn'

AttributeError                            Traceback (most recent call last)
Cell In[37], line 65
     63 dataset_name = semantic_model_name
     64 workspace_name = semantic_model_workspace
---> 65 add_columns_from_configuration(config_file_path, dataset_name, workspace_name)

Cell In[37], line 11, in add_columns_from_configuration(tableNameConfig, dataset, workspace)
      8     config_df = spark.read.table(tableNameConfig).filter(col("SEMANTICMODEL_NAME")==semantic_model_name).orderBy(col("SORT_BY").asc_nulls_first())
     10     # Connect to the semantic model
---> 11     with connect_semantic_model(dataset=dataset, readonly=False, workspace=workspace) as tom:
     12         # Iterate through each row in the configuration DataFrame
     13         for row in config_df.collect():
     14             table_name = row['SEMANTICMODEL_TABLE']

File ~/cluster-env/trident_env/lib/python3.11/contextlib.py:144, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    142 if typ is None:
    143     try:
--> 144         next(self.gen)
    145     except StopIteration:
    146         return False

File ~/cluster-env/trident_env/lib/python3.11/site-packages/sempy_labs/tom/_model.py:6069, in connect_semantic_model(dataset, readonly, workspace)
   6067     yield tw
   6068 finally:
-> 6069     tw.close()

File ~/cluster-env/trident_env/lib/python3.11/site-packages/sempy_labs/tom/_model.py:5981, in TOMWrapper.close(self)
   5975 for c in self.all_columns():
   5976     # if c.LineageTag in list(self._column_map.keys()):
   5977     if any(
   5978         p.SourceType == TOM.PartitionSourceType.Entity
   5979         for p in c.Parent.Partitions
   5980     ):
-> 5981         if c.Name != c.SourceColumn:
   5982             self.add_changed_property(object=c, property="Name")
   5983             # c.SourceLineageTag = c.SourceColumn
   5984     # if self._column_map.get(c.LineageTag)[0] != c.Name:
   5985     #     self.add_changed_property(object=c, property="Name")

AttributeError: 'CalculatedColumn' object has no attribute 'SourceColumn'
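
Reading the traceback, the failure happens in `TOMWrapper.close`, where `c.SourceColumn` is read for every column of an Entity-partitioned table, including the new calculated column. The sketch below reproduces that pattern in isolation. The `DataColumn` and `CalculatedColumn` classes are hypothetical pure-Python stand-ins for the TOM types, and the `getattr` guard is just one possible way to skip columns that have no `SourceColumn`; it is not the library's actual code or fix, and it assumes the real objects also raise `AttributeError` for a missing attribute, as the traceback suggests.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the TOM column types; the real objects are
# .NET classes surfaced through sempy_labs and are not reproduced here.
@dataclass
class DataColumn:
    Name: str
    SourceColumn: str


@dataclass
class CalculatedColumn:
    Name: str
    Expression: str


def renamed_columns(columns):
    """Return names of columns whose Name differs from their SourceColumn.

    Columns without a SourceColumn attribute (calculated columns in this
    sketch) are skipped instead of raising AttributeError.
    """
    renamed = []
    for c in columns:
        source = getattr(c, "SourceColumn", None)
        if source is not None and c.Name != source:
            renamed.append(c.Name)
    return renamed


columns = [
    DataColumn(Name="Amount", SourceColumn="Amount"),
    DataColumn(Name="Customer", SourceColumn="customer_id"),
    CalculatedColumn(Name="Margin", Expression="[Revenue] - [Cost]"),
]

# Unguarded access fails the same way as line 5981 in the traceback.
try:
    columns[2].SourceColumn
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
print(renamed_columns(columns))
```

An `isinstance` check against the data column type would work as well; `getattr` with a default is used here only because it does not depend on knowing which column types expose `SourceColumn`.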

Expected behavior
I expect the calculated column to be added to the Direct Lake semantic model.

Additional context
The context is the creation of a Direct Lake semantic model, e.g.:
labs.directlake.generate_direct_lake_semantic_model(
    semantic_model_name,
    singletable,
    semantic_model_workspace,
    lakehouse,
    workspace,
    'dbo',
    True,
    True
)

Labels: bug