
How to Diagnose if Your Traefik Container is Unhealthy

Step-by-step ways to determine what the real problem is

Troubleshooting Common Docker Errors During Project Setup

Every now and then, running docker-compose up -d or .\up.ps1 fails with the following message.

ERROR: traefik Container "06bb2d97202e" is unhealthy.
ERROR: Encountered errors while bringing up the project.

This can be extremely annoying and, if you don’t approach it efficiently, can cost you a lot of time. It may occur for no obvious reason, and sometimes the simplest solution is all that is needed.

More often than not, it has to do with whatever was done last in your project or with the most recent pull.

Solution 1: Perform a Down and Reboot

It’s not a guarantee, but there are times when simply rebooting your system resolves the error. Before you reboot, however, I highly recommend running .\down.ps1 or docker-compose down so that you don’t have any active containers.
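A minimal sketch of that clean shutdown, assuming you run it from the folder that contains your docker-compose files. The second command is only a sanity check before you reboot.

```powershell
# Stop and remove the project's containers (or run .\down.ps1 if your repo provides it)
docker-compose down

# Confirm nothing is still running; only the header row should be printed
docker ps
```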

Solution 2: Verify Enough Disk Space Exists

If you have been running Docker for a long time, perhaps on an older version, and your system shuts down or reboots without your containers shutting down properly, Docker can eat up disk space fairly quickly. Given the requirements of running an XM Cloud local environment, running out of disk space can produce an error that traefik cannot be started.
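One way to check this from PowerShell is sketched below. It assumes Docker stores its data on the C: drive, which may not match your setup. The prune step deletes data, so treat it as optional.

```powershell
# Free and used space on the system drive (assumes Docker's data lives on C:)
Get-PSDrive -Name C | Select-Object Used, Free

# Space consumed by Docker images, containers, volumes and build cache
docker system df

# Optional cleanup: removes stopped containers, unused networks,
# dangling images and unused build cache. Docker asks for confirmation first.
docker system prune
```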

Solution 3: Verify a Valid License File

Another reason, obvious as it may seem, is that your license file may have expired or may not be in the location provided when the environment was initialized. This will show up in the CM logs. All it takes is for someone to modify the docker-compose files to specify a different location, and while you wouldn’t expect this to be an issue, it can happen when you are working in a larger group of developers.
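A quick way to check both points from PowerShell is sketched here. The license path and the service name cm are assumptions for illustration; substitute the location your environment was initialized with and the service name used in your docker-compose files.

```powershell
# Hypothetical path: replace with the license location your environment expects
$licensePath = "C:\License\license.xml"
Test-Path -Path $licensePath   # True if the file exists at that location

# Search the recent CM logs for license-related messages
# ("cm" is an assumed service name; adjust it to match your compose files)
docker-compose logs --tail=200 cm | Select-String -Pattern "license"
```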

Solution 4: Ensure Conflicting Processes Are Shut Down

This includes things like SQL Server, IIS, or a local Solr instance. Any of these can conflict with your containers and needs to be shut down before running docker-compose up or .\up.ps1.

You can run the following commands to ensure you don’t have additional processes running that may interfere. The following will check web services running on port 443.

Get-Process -Id (Get-NetTCPConnection -LocalPort 443).OwningProcess

This will check for services such as Solr on port 8984.

Get-Process -Id (Get-NetTCPConnection -LocalPort 8984).OwningProcess
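The one-liners above raise an error when nothing is using the port. A slightly more forgiving variant, offered as a sketch rather than a required step, checks both ports in a loop and reports a free port instead of failing.

```powershell
# Ports taken from the checks above: 443 (web) and 8984 (Solr)
foreach ($port in 443, 8984) {
    # SilentlyContinue suppresses the error raised when nothing is listening on the port
    $connections = Get-NetTCPConnection -LocalPort $port -State Listen -ErrorAction SilentlyContinue
    if ($connections) {
        # Show each distinct process that is listening on the port
        $connections | Select-Object -ExpandProperty OwningProcess -Unique |
            ForEach-Object { Get-Process -Id $_ } |
            Select-Object @{ Name = 'Port'; Expression = { $port } }, Id, ProcessName
    } else {
        Write-Output "Port $port is free"
    }
}
```

If IIS turns out to be the owner, running iisreset /stop from an elevated prompt is one way to stop it before bringing the containers up.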

Solution X: What if It’s None of Those?

If you’ve performed the previous four solutions, you’re probably thinking: OK, done that, what’s next? How do I figure out what the problem really is?

Notice that the error message references a container ID. That ID identifies the container that is actually the problem, and the following command lists your containers so you can see which one it belongs to.

docker ps

You should see the following.

Screenshot of Docker terminal displaying containers running Sitecore images with status updates

With the container ID from the initial error message, you can now determine exactly which container is causing traefik to fail. Follow the next steps to find the error that is occurring.
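If the docker ps listing is long, the filters sketched below can narrow it down. The ID is the one from the example error at the top of this article; substitute the one from your own message.

```powershell
# Show only containers whose health check is currently failing
docker ps --all --filter "health=unhealthy"

# Look up the container named in the error message by its ID
docker ps --all --filter "id=06bb2d97202e"

# Print the recorded health check results for that container
docker inspect --format "{{json .State.Health}}" 06bb2d97202e
```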

  1. Open up Docker Desktop
  2. Open the offending container to view the log files.
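If you prefer the command line over Docker Desktop, the same logs can be read with docker logs. As before, the ID shown is the one from the example error message and should be replaced with your own.

```powershell
# Show the last 100 log lines for the offending container
docker logs --tail 100 06bb2d97202e

# Or keep the log open and watch new lines as they arrive (Ctrl+C to stop)
docker logs --follow --tail 20 06bb2d97202e
```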

At this point, diagnosing the issue becomes very similar to diagnosing issues within a local instance of Sitecore XP or XM. One challenge I ran into recently was that my CM instance wasn’t displaying the error. The cause was a misspelling in my config patch file, but my CM logs never showed the error that was occurring.

Docker Desktop UI showing logs of Sitecore tasks and jobs including health checks and indexing operations

Knowing that, and recognizing that traefik works top down via the rendering container, I next checked the logs for that container. Lo and behold, the error was shown as part of the build process for the front end.

The main thing to recognize is that the error isn’t always where you’d expect it to be. We had a CM issue, but for whatever reason the error wasn’t displayed in the CM container’s logs; it showed up in the rendering container instead.
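Since the error can surface in a container other than the one you suspect, it can help to scan every container’s recent logs in one pass. The loop below is a rough sketch: the search term "error" and the 200-line window are arbitrary choices, not values from any Sitecore guidance.

```powershell
# Loop over every container (running or stopped) and show recent log lines mentioning "error"
docker ps --all --format "{{.Names}}" | ForEach-Object {
    $name = $_
    # 2>&1 merges stderr into the pipeline so both output streams are searched
    $hits = docker logs --tail 200 $name 2>&1 | Select-String -Pattern "error"
    if ($hits) {
        Write-Output "--- $name ---"
        $hits | Select-Object -Last 10
    }
}
```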



Meet David Austin

Development Team Lead | Sitecore Technology MVP x 3

📷🕹️👪

David is a decorated Development Team Lead with Sitecore Technology MVP and Coveo MVP awards, as well as Sitecore CDP & Personalize Certified. He's worked in IT for 25 years in roles ranging from Developer to Business Analyst to Group Lead, helping manage everything from Intranet and Internet sites to facility management and application support. David is a dedicated family man who loves to spend time with his girls. He's also an avid photographer and loves to explore new places.
