Configuring Logging Links in UI#
To debug your workflows in production, you want to access logs from your tasks as they run. These logs are different from the core Flyte platform logs, are specific to execution, and may vary from plugin to plugin; for example, Spark may have driver and executor logs.
Every organization potentially uses a different log aggregator, making it hard to create a one-size-fits-all solution. Examples include cloud-hosted solutions such as AWS CloudWatch, GCP Stackdriver, Splunk, and Datadog.
Flyte provides a simplified interface to configure your log provider. Flyte-sandbox ships with the Kubernetes dashboard to visualize the logs. This may not be safe for production, so we recommend exploring other log aggregators for production deployments.
How to configure?#
To configure your log provider, the provider needs to support URL links that are shareable and can be templatized. Flyte uses a templated URI to generate a unique link to the logs of a specific task. The templated URI has access to the following parameters:
| Parameter | Description |
|---|---|
| `{{ .podName }}` | Gets the pod name as it shows in the k8s dashboard |
| `{{ .podUID }}` | The pod UID generated by k8s at runtime |
| `{{ .namespace }}` | K8s namespace where the pod runs |
| `{{ .containerName }}` | The container name that generated the log |
| `{{ .containerId }}` | The container id docker/crio generated at run time |
| `{{ .logName }}` | A deployment-specific name where to expect the logs to be |
| `{{ .hostname }}` | The hostname where the pod is running and logs reside |
| `{{ .podUnixStartTime }}` | The pod creation time (in unix seconds, not millis) |
| `{{ .podUnixFinishTime }}` | Don't have a good mechanism for this yet, but approximating with `time.Now()` |
The parameterization engine uses Golang's native templating format, which delimits parameters with `{{ }}`. An example configuration looks as follows:
```yaml
task_logs:
  plugins:
    logs:
      templates:
        - displayName: <name-to-show>
          templateUris:
            - "https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/flyte-production/kubernetes;stream=var.log.containers.{{.podName}}_{{.namespace}}_{{.containerName}}-{{.containerId}}.log"
            - "https://some-other-source/home?region=us-east-1#logEventViewer:group=/flyte-production/kubernetes;stream=var.log.containers.{{.podName}}_{{.namespace}}_{{.containerName}}-{{.containerId}}.log"
          messageFormat: 0 # optional; use 0 for "unknown", 1 for "csv", or 2 for "json"
```
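To see what a `templateUris` entry expands to, you can render one with Go's `text/template` package, the same engine Flyte uses. The parameter values below are hypothetical; at runtime FlytePropeller fills them in from the pod that ran the task.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderLogURI fills a log-link template with pod metadata using Go's
// text/template engine, the same {{ }} syntax the task_logs config uses.
func renderLogURI(uri string, params map[string]string) (string, error) {
	tmpl, err := template.New("log").Parse(uri)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, params); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Hypothetical values; at runtime Flyte supplies these from the task's pod.
	params := map[string]string{
		"podName":       "mytask-n0-0",
		"namespace":     "flytesnacks-development",
		"containerName": "mytask-n0-0",
		"containerId":   "1a2b3c4d",
	}
	uri := "stream=var.log.containers.{{.podName}}_{{.namespace}}_{{.containerName}}-{{.containerId}}.log"
	out, err := renderLogURI(uri, params)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints: stream=var.log.containers.mytask-n0-0_flytesnacks-development_mytask-n0-0-1a2b3c4d.log
}
```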
Tip
Since Helm charts use the same `{{ }}` templating syntax, rendering the chart would cause Helm to substitute the Flyte log link templates as well. To avoid this, use escaped templating for the Flyte log links in the Helm chart.
This ensures that the Flyte log link templates remain in place during Helm chart rendering.
For example:
If your configuration looks like this:
```
https://someexample.com/app/podName={{ "{{" }} .podName {{ "}}" }}&containerName={{ "{{" }} .containerName {{ "}}" }}
```
rendering the Helm chart will generate:
```
https://someexample.com/app/podName={{ .podName }}&containerName={{ .containerName }}
```
and the FlytePropeller pod will then expand the final log link as:
```
https://someexample.com/app/podName=pname&containerName=cname
```
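Because Helm also renders with Go's `text/template`, the escaping trick can be reproduced locally as a quick sanity check: the quoted `"{{"` and `"}}"` survive one round of templating as literal braces, leaving the Flyte template intact for FlytePropeller to expand later.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// helmRender mimics one round of Helm's chart rendering (Go text/template).
func helmRender(s string) (string, error) {
	tmpl, err := template.New("chart").Parse(s)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	// Render with no values, as Helm would for this fragment.
	if err := tmpl.Execute(&buf, nil); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	escaped := `podName={{ "{{" }} .podName {{ "}}" }}`
	out, err := helmRender(escaped)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints: podName={{ .podName }}
}
```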
The configuration above will output two log links per task for task types that use the logging plugin. Note that not all task types use it; for example, the SageMaker plugin uses the log output provided by SageMaker, and the Snowflake plugin links to the Snowflake console.
Datadog Integration#
To send your Flyte workflow logs to Datadog, you can follow these steps:
Enable collection of logs from containers and collection of logs using files. The precise configuration steps will vary depending on your specific setup.
For instance, if you’re using Helm, use the following config:
```yaml
logs:
  enabled: true
  containerCollectAll: true
  containerCollectUsingFiles: true
```
If you’re using environment variables, use the following config:
```yaml
DD_LOGS_ENABLED: "true"
DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL: "true"
DD_LOGS_CONFIG_K8S_CONTAINER_USE_FILE: "true"
DD_CONTAINER_EXCLUDE_LOGS: "name:datadog-agent" # avoid tracking logs produced by the Datadog agent itself
```
Warning
The boolean values have to be represented as strings.
The Datadog guide includes a section on mounting volumes. Mapping the `logpodpath` and `logcontainerpath` volumes, as illustrated in the linked example, is a prerequisite for log collection to function. The `pointerdir` volume is optional, but mapping it is recommended to prevent the loss of container logs during restarts or network issues (as stated in the guide).
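As a sketch of what those mounts look like on the Datadog agent container, the volume names below come from the guide, while the `hostPath` locations are the conventional defaults and may differ in your cluster:

```yaml
# Sketch only: hostPath locations are assumptions based on common Datadog
# agent setups; verify them against the Datadog guide for your environment.
volumes:
  - name: logpodpath
    hostPath:
      path: /var/log/pods
  - name: logcontainerpath
    hostPath:
      path: /var/lib/docker/containers
  - name: pointerdir # optional, but prevents log loss across restarts
    hostPath:
      path: /opt/datadog-agent/run
volumeMounts:
  - name: logpodpath
    mountPath: /var/log/pods
    readOnly: true
  - name: logcontainerpath
    mountPath: /var/lib/docker/containers
    readOnly: true
  - name: pointerdir
    mountPath: /opt/datadog-agent/run
```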