See [contributing instructions](CONTRIBUTING.md) to help improve this project.

  * [<b>Always run this workflow AFTER the assessment has finished</b>](#balways-run-this-workflow-after-the-assessment-has-finishedb)
  * [[EXPERIMENTAL] Migrate tables in mounts Workflow](#experimental-migrate-tables-in-mounts-workflow)
  * [Jobs Static Code Analysis Workflow](#jobs-static-code-analysis-workflow)
    * [Linter message codes](#linter-message-codes)
      * [`cannot-autofix-table-reference`](#cannot-autofix-table-reference)
      * [`catalog-api-in-shared-clusters`](#catalog-api-in-shared-clusters)
      * [`changed-result-format-in-uc`](#changed-result-format-in-uc)
      * [`dbfs-read-from-sql-query`](#dbfs-read-from-sql-query)
      * [`dbfs-usage`](#dbfs-usage)
      * [`default-format-changed-in-dbr8`](#default-format-changed-in-dbr8)
      * [`dependency-not-found`](#dependency-not-found)
      * [`direct-filesystem-access`](#direct-filesystem-access)
      * [`implicit-dbfs-usage`](#implicit-dbfs-usage)
      * [`jvm-access-in-shared-clusters`](#jvm-access-in-shared-clusters)
      * [`legacy-context-in-shared-clusters`](#legacy-context-in-shared-clusters)
      * [`not-supported`](#not-supported)
      * [`notebook-run-cannot-compute-value`](#notebook-run-cannot-compute-value)
      * [`python-udf-in-shared-clusters`](#python-udf-in-shared-clusters)
      * [`rdd-in-shared-clusters`](#rdd-in-shared-clusters)
      * [`spark-logging-in-shared-clusters`](#spark-logging-in-shared-clusters)
      * [`sql-parse-error`](#sql-parse-error)
      * [`sys-path-cannot-compute-value`](#sys-path-cannot-compute-value)
      * [`table-migrated-to-uc`](#table-migrated-to-uc)
      * [`to-json-in-shared-clusters`](#to-json-in-shared-clusters)
      * [`unsupported-magic-line`](#unsupported-magic-line)
* [Utility commands](#utility-commands)
  * [`logs` command](#logs-command)
  * [`ensure-assessment-run` command](#ensure-assessment-run-command)

[[back to top](#databricks-labs-ucx)]

### Linter message codes

Here's the detailed explanation of the linter message codes:

#### `cannot-autofix-table-reference`

This indicates that the linter has found a table reference that cannot be automatically fixed. The user must manually
update the table reference to point to the correct table in Unity Catalog. This mostly occurs when the table name is
computed dynamically and is too complex for our static code analysis to resolve. We detect this problem anywhere a
table name could be used: `spark.sql`, `spark.catalog.*`, `spark.table`, `df.write.*`, and many more. Code examples
that trigger this problem:

```python
spark.table(f"foo_{some_table_name}")
# ..
df = spark.range(10)
df.write.saveAsTable(f"foo_{some_table_name}")
# .. or even
df.write.insertInto(f"foo_{some_table_name}")
```

Here the `some_table_name` variable is not defined anywhere in the visible scope. However, the analyser successfully
detects the table name if it is defined:

```python
some_table_name = 'bar'
spark.table(f"foo_{some_table_name}")
```

We even detect string constants coming either from `dbutils.widgets.get` (via job named parameters) or through
loop variables. If the `old.things` table is migrated to `brand.new.stuff` in Unity Catalog, the following code will
trigger two messages: [`table-migrated-to-uc`](#table-migrated-to-uc) for the first query, as the contents are clearly
analysable, and `cannot-autofix-table-reference` for the second query.

```python
# ucx[table-migrated-to-uc:+4:4:+4:20] Table old.things is migrated to brand.new.stuff in Unity Catalog
# ucx[cannot-autofix-table-reference:+3:4:+3:20] Can't migrate table_name argument in 'spark.sql(query)' because its value cannot be computed
table_name = f"table_{index}"
for query in ["SELECT * FROM old.things", f"SELECT * FROM {table_name}"]:
    spark.sql(query).collect()
```

[[back to top](#databricks-labs-ucx)]

#### `catalog-api-in-shared-clusters`

`spark.catalog.*` functions require Databricks Runtime 14.3 LTS or above on Unity Catalog clusters in Shared access
mode, so if your code has `spark.catalog.tableExists("table")` or `spark.catalog.listDatabases()`, you need to ensure
that your cluster is running the correct runtime version and data security mode.
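
As a minimal sketch of code that would be flagged on an incompatible cluster (the table name is a placeholder):

```python
# Both calls require DBR 14.3 LTS+ on a UC cluster in Shared access mode.
if spark.catalog.tableExists("table"):
    for database in spark.catalog.listDatabases():
        print(database.name)
```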

[[back to top](#databricks-labs-ucx)]

#### `changed-result-format-in-uc`

Calls to these functions (the `spark.catalog.*` APIs above) return a list of `<catalog>.<database>.<table>` names
instead of `<database>.<table>`. So if you have code like this:

```python
for table in spark.catalog.listTables():
    do_stuff_with_table(table)
```

you need to make sure that `do_stuff_with_table` can handle the new format.
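
A hedged sketch of a `do_stuff_with_table` that tolerates both naming formats (the helper is illustrative, not part of UCX):

```python
def do_stuff_with_table(table):
    # Accept both the legacy <database>.<table> and the new
    # <catalog>.<database>.<table> formats by keeping the last component.
    *qualifiers, table_name = str(table).split(".")
    print(qualifiers, table_name)
```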

[[back to top](#databricks-labs-ucx)]

#### `dbfs-read-from-sql-query`

DBFS access is not allowed in Unity Catalog, so if you have code like this:

```python
df = spark.sql("SELECT * FROM parquet.`/mnt/foo/path/to/file`")
```

you need to change it to use UC tables.
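
For example, once the data lives in a Unity Catalog table (the `catalog.schema.things` name below is a placeholder), the query becomes:

```python
# Read from a Unity Catalog table instead of a DBFS path.
df = spark.sql("SELECT * FROM catalog.schema.things")
```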

[[back to top](#databricks-labs-ucx)]

#### `dbfs-usage`

DBFS does not work in Unity Catalog, so if you have code like this:

```python
display(spark.read.csv('/mnt/things/e/f/g'))
```

you need to change it to use UC tables or UC volumes.
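
A sketch of the same read going through a Unity Catalog volume instead (the volume path is a placeholder):

```python
# Volume paths follow /Volumes/<catalog>/<schema>/<volume>/...
display(spark.read.csv('/Volumes/my_catalog/my_schema/things/e/f/g'))
```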

[[back to top](#databricks-labs-ucx)]

#### `dependency-not-found`

This message indicates that the linter has found a dependency, such as a Python source file or a notebook, that is not
available in the workspace. The user must ensure that the dependency is available in the workspace. This usually
means an error in the user code.
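
For instance, a call like the following is flagged when the referenced notebook does not exist in the workspace (the path is illustrative):

```python
# Fails dependency resolution if ./utils/helpers is missing
# from the workspace relative to this notebook.
dbutils.notebook.run("./utils/helpers", 60)
```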

[[back to top](#databricks-labs-ucx)]

#### `direct-filesystem-access`

Direct filesystem access is not allowed in Unity Catalog, so if you have code like this:

```python
spark.read.csv("s3://bucket/path")
```

you need to change it to use UC tables or UC volumes.

[[back to top](#databricks-labs-ucx)]

#### `implicit-dbfs-usage`

The use of DBFS is not allowed in Unity Catalog, so if you have code like this:

```python
display(spark.read.csv('/mnt/things/e/f/g'))
```

you need to change it to use UC tables or UC volumes.

[[back to top](#databricks-labs-ucx)]

#### `jvm-access-in-shared-clusters`

You cannot access the Spark Driver JVM on Unity Catalog clusters in Shared access mode. If you have code like this:

```python
spark._jspark._jvm.com.my.custom.Name()
```

or like this:

```python
log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger(__name__)
```

you need to change it to use Python equivalents.
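
As one possible Python equivalent for the log4j snippet above, the standard `logging` module covers the same need:

```python
import logging

# Pure-Python logging; no Spark Driver JVM access required.
LOGGER = logging.getLogger(__name__)
LOGGER.info("test")
```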

[[back to top](#databricks-labs-ucx)]

#### `legacy-context-in-shared-clusters`

SparkContext (`sc`) is not supported on Unity Catalog clusters in Shared access mode. Rewrite it using SparkSession
(`spark`). Example code that triggers this message:

```python
df = spark.createDataFrame(sc.emptyRDD(), schema)
```

or this:

```python
sc.parallelize([1, 2, 3])
```
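
A possible SparkSession-based rewrite of both snippets (assuming `schema` is defined as in the first example):

```python
# Empty DataFrame without touching the SparkContext.
df = spark.createDataFrame([], schema)

# Small in-memory DataFrame instead of a parallelized RDD.
numbers = spark.createDataFrame([(1,), (2,), (3,)], "value int")
```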

[[back to top](#databricks-labs-ucx)]

#### `not-supported`

Installing eggs is no longer supported on Databricks Runtime 14.0 or higher.
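
For example, a notebook cell like this (the egg path is a placeholder) triggers the message; repackage the library as a wheel instead:

```python
%pip install /dbfs/FileStore/jars/library.egg
```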

[[back to top](#databricks-labs-ucx)]

#### `notebook-run-cannot-compute-value`

The path passed to `dbutils.notebook.run` cannot be computed, so the notebook path needs adjusting.
Automated code analysis cannot determine where the notebook is located, so you need to simplify code like:

```python
b = some_function()
dbutils.notebook.run(b)
```

to something like this:

```python
a = "./leaf1.py"
dbutils.notebook.run(a)
```

[[back to top](#databricks-labs-ucx)]

#### `python-udf-in-shared-clusters`

`applyInPandas` requires DBR 14.3 LTS or above on Unity Catalog clusters in Shared access mode. Example:

```python
df.groupby("id").applyInPandas(subtract_mean, schema="id long, v double").show()
```

Arrow UDFs require DBR 14.3 LTS or above on Unity Catalog clusters in Shared access mode.

```python
@udf(returnType='int', useArrow=True)
def arrow_slen(s):
    return len(s)
```

It is not possible to register a Java UDF from Python code on Unity Catalog clusters in Shared access mode. Use a
`%scala` cell to register the Scala UDF using `spark.udf.register`. Example code that triggers this message:

```python
spark.udf.registerJavaFunction("func", "org.example.func", IntegerType())
```

[[back to top](#databricks-labs-ucx)]

#### `rdd-in-shared-clusters`

RDD APIs are not supported on Unity Catalog clusters in Shared access mode. Use `mapInArrow()` or Pandas UDFs instead.

```python
df.rdd.mapPartitions(myUdf)
```
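
A hedged sketch of the same per-partition processing via `mapInArrow` (the pass-through transform is illustrative; a real `myUdf` would modify the batches):

```python
def my_arrow_udf(batches):
    # Receives an iterator of pyarrow.RecordBatch objects per partition.
    for batch in batches:
        yield batch  # transform each batch here instead of passing it through

result = df.mapInArrow(my_arrow_udf, df.schema)
```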

[[back to top](#databricks-labs-ucx)]

#### `spark-logging-in-shared-clusters`

Cannot set Spark log level directly from code on Unity Catalog clusters in Shared access mode. Remove the call and set
the cluster Spark conf `spark.log.level` instead:

```python
sc.setLogLevel("INFO")
setLogLevel("WARN")
```

Another example could be:

```python
log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger(__name__)
```

or

```python
sc._jvm.org.apache.log4j.LogManager.getLogger(__name__).info("test")
```

[[back to top](#databricks-labs-ucx)]

#### `sql-parse-error`

This is a generic message indicating that the SQL query could not be parsed. The user must manually check the SQL query.
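
A hypothetical example: the unbalanced parenthesis below makes the query unparseable, so the linter can only report the failure:

```python
# Unbalanced parenthesis: the statement cannot be parsed.
spark.sql("SELECT * FROM (SELECT 1")
```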

[[back to top](#databricks-labs-ucx)]

#### `sys-path-cannot-compute-value`

The path passed to `sys.path.append` cannot be computed, so the path needs adjusting. Automated code analysis cannot
determine where the path points.
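
By analogy with [`notebook-run-cannot-compute-value`](#notebook-run-cannot-compute-value), simplify a dynamic path (the function and path below are illustrative):

```python
import sys

# Dynamic value: static analysis cannot resolve it.
sys.path.append(some_function())
```

to something like this:

```python
import sys

# Constant path that the analyser can follow.
sys.path.append("/Workspace/my_project/utils")
```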

[[back to top](#databricks-labs-ucx)]

#### `table-migrated-to-uc`

This message indicates that the linter has found a table that has been migrated to Unity Catalog. The user must ensure
that the table is available in Unity Catalog.
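
Reusing the mapping from [`cannot-autofix-table-reference`](#cannot-autofix-table-reference) above, where `old.things` was migrated to `brand.new.stuff`, a reference like this gets the advisory:

```python
# Flagged with table-migrated-to-uc: this reference should point
# at brand.new.stuff after the migration.
df = spark.table("old.things")
```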

[[back to top](#databricks-labs-ucx)]

#### `to-json-in-shared-clusters`

`toJson()` is not available on Unity Catalog clusters in Shared access mode. Use `toSafeJson()` on DBR 13.3 LTS or
above to get a subset of command context information. Example code that triggers this message:

```python
dbutils.notebook.entry_point.getDbutils().notebook().getContext().toSafeJson()
```

[[back to top](#databricks-labs-ucx)]

#### `unsupported-magic-line`

This message indicates code that could not be analysed by UCX. The user must check the code manually.

[[back to top](#databricks-labs-ucx)]

# Utility commands

## `logs` command