A Unique Production Issue at JSS

Arjun Khandelwal

Published: Apr 20, 2015

The first Bahmni implementation was at Jan Swasthya Sahyog (JSS), a hospital in rural Chhattisgarh, India. The implementation has been ongoing for two years. I recently joined the project to take on some of the responsibility for it. In my first week, I faced a unique production issue. I call it unique because, as an Application Developer, I had never encountered an issue like this before, and even more so because of the challenges that made it difficult to solve.

# What happened?

I was in Bangalore to understand the context of the project from the team when we got a call from JSS. They informed us that an electrical surge had caused the EMR (Electronic Medical Record) application and the Internet to go down. While the Internet outage at JSS didn’t fall under the ThoughtWorks project purview, the application did. For those who aren’t aware, JSS is located at least a day’s travel away from any ThoughtWorks location in India.

# So, what did this mean?

Production being down is of the highest priority and severity for any client that we work for, irrespective of the domain. We take downtime of even a few minutes very seriously. Depending on the role that the application plays, its downtime could hurt the client in many ways, ranging from revenue loss to legal issues to reputation damage to loss of productivity. In this case, the application being down meant longer waits and more distress for patients, more than 50 percent of whom suffer from chronic and serious illnesses like tuberculosis (TB), cancer and diabetes.

# Why was this a difficult problem to solve?

We had to get to the root of the problem. We started by speaking to the people who handle the IT systems at JSS. Troubleshooting turned out to be tricky and cumbersome for two major reasons:

  • No Internet. This meant:
    • The option of logging in over VPN to troubleshoot was ruled out
    • Platforms like email, Skype and WhatsApp, which provide faster and better communication through file sharing (mainly images, in addition to text), were ruled out
  • The IT staff at JSS have limited knowledge of the application internals and UNIX-like systems, which meant:
    • All troubleshooting actions had to be explained step by step over the phone
    • All troubleshooting commands had to be dictated letter by letter. For instance, for them to key in "ls /etc/sysconfig/", we had to say “ls space slash etc slash sys tab <look at what is coming up> <select sysconfig> …”
    • The results expected from troubleshooting commands/actions had to be explained to them verbally
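
To give a feel for what dictating a command involves, here is a small illustrative sketch (not something the team used) that turns a shell command into words that could be read out over the phone. The function name `spell_command` and the symbol vocabulary are assumptions made for this example.

```python
# Illustrative only: spell out a shell command for dictation over the phone.
# The symbol names below are an assumed vocabulary, not a standard.
SYMBOL_NAMES = {
    " ": "space",
    "/": "slash",
    "-": "dash",
    ".": "dot",
    "_": "underscore",
    "|": "pipe",
}


def spell_command(command):
    """Return the command as a string of speakable words."""
    words = []
    current = ""
    for ch in command:
        if ch in SYMBOL_NAMES:
            # Flush the letters collected so far, then name the symbol.
            if current:
                words.append(current)
                current = ""
            words.append(SYMBOL_NAMES[ch])
        else:
            current += ch
    if current:
        words.append(current)
    return " ".join(words)


print(spell_command("ls /etc/sysconfig/"))
```

Even this short command becomes seven spoken tokens, and the listener still has to type each one correctly and read the result back.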

These challenges increased the resolution time.

Initially, only Mujir (a senior developer on the project with good knowledge of JSS and Bahmni) was looking into this. But after listening to the challenges he was facing, a couple of us decided to team up with him to find a solution. It was especially intriguing to me, given that I was going to take on some responsibility for this implementation in a few weeks, working at the site.

# What was done?

After many phone conversations which lasted a few hours, we started to look at hardware failures. We did the following:

  • Replace Network Interface Card (NIC)

We zeroed in on the NIC on the application server box as one of the probable issues. The NIC had probably burned out because of the power surge. Narain, who had some experience handling computer systems, had recently joined JSS and was working with us on the issue. He suggested that we replace the NIC with one of the spares available to see if it fixed the problem. However, this was a server box and none of us had hands-on experience with server hardware modifications. We hadn’t interacted with Narain much before and were unaware of his experience in handling this. We were conflicted about the option of supporting him remotely. We even started exploring the option of having someone from the project travel to JSS, preferably from the Hyderabad office as it is the closest. But the earliest anyone could reach JSS was the afternoon of the next day.

While the logistics of a team member travelling to JSS were being sorted out, we deliberated and decided to go ahead with remotely supporting Narain to replace the NIC as the immediate course of action. He seemed quite confident of being able to replace the card. After opening the box, Narain gave us the bad news that the card was integrated with the motherboard and couldn’t be replaced. What’s worse, the phone got disconnected. Even as we quickly started evaluating other options, we got a call back from Narain a few minutes later with the good news that he had fitted a spare card into an external slot that was available.

  • Replace Hard Drive

The next task was to get the box onto the network. We were not able to get this done quickly as we were all fairly new to this exercise. The limited time that we had was slipping by. So, we proposed another solution. We had a master-slave setup at JSS for data replication, with the application server on the master hardware. Both the master and the slave have the same hardware configuration. We decided to swap the hard disks of the master and the slave so that the slave hardware could be used as the master.

  • Set up Network Interface

But this meant that the MAC address mapping would change. We had landed on the same problem again: the network interface wasn’t working. This happens sometimes when troubleshooting. We stayed calm and went through the process again. After re-reading and exploring, we identified that we were missing one of the steps: creating the ifcfg-<network interface> file. By following all the steps correctly, we were able to bring the system up. Phew!
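
For readers who have not seen such a file, the sketch below renders the kind of content an ifcfg-<network interface> file typically holds. It assumes a Red Hat-style layout, where these files usually live under `/etc/sysconfig/network-scripts/`; the post does not name the distribution, and the function name, interface name, MAC address and IP values here are all hypothetical placeholders, not the values used at JSS.

```python
# A minimal sketch, assuming a Red Hat-style ifcfg file format.
# All values passed in below are hypothetical placeholders.
def render_ifcfg(device, hwaddr, ipaddr, netmask, gateway=None):
    """Return the text of an ifcfg-<device> file for a static IP setup."""
    lines = [
        f"DEVICE={device}",
        # HWADDR ties the config to a specific card; after moving a disk
        # to different hardware it must match the MAC of the new NIC.
        f"HWADDR={hwaddr.upper()}",
        "BOOTPROTO=none",  # static addressing, no DHCP
        "ONBOOT=yes",      # bring the interface up at boot
        f"IPADDR={ipaddr}",
        f"NETMASK={netmask}",
    ]
    if gateway:
        lines.append(f"GATEWAY={gateway}")
    return "\n".join(lines) + "\n"


print(render_ifcfg("eth0", "00:11:22:aa:bb:cc", "192.168.0.10", "255.255.255.0"))
```

The `HWADDR` line is the part that matters in this story: a disk moved into a different box carries a configuration written for the old card’s MAC address, so the interface does not come up until the file matches the new hardware.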

It had been almost 7 hours since we started looking into this issue. While we were working on getting the application up, work to fix the Internet was going on in parallel, and the Internet was up by the time the application was. Thankfully, we could log in over the VPN and verify what had been done. The slave box was happily playing the role of master. However, replication was broken as there was no box acting as slave. By this time, all of us, including Narain at JSS, were confident of setting up the network interface, amongst other things. So we quickly fitted the hard disk that we had taken out of the slave box into the master box and set up the network interface and IPs appropriately. This swapped the master and slave roles between the two machines, but everything was back up again. We verified that everything was up and in order before breathing a sigh of relief.

All’s well that ends well, and this time it taught us well too. Here are my learnings from this experience.

# Learnings

  • Power Surge and Surge Protectors

I learnt about power surges and surge protectors. Here’s what you should know. Power surges are short, fast spikes in the electricity being supplied to a power outlet. Many events can cause power surges, such as lightning strikes, power outages, short circuits, electromagnetic pulses, and large machines on the same power line being turned on or off. When a computer is plugged into an outlet or connected to a router via a cable, it is vulnerable to power surges. Power surges wreak havoc on computer hardware and can physically fry the network interface card.

Earlier, network cards were built as expansion cards and plugged into computer buses. But with the increased usage of networks and the Internet, they now come built directly into the motherboard, which makes them smaller, more delicate and more vulnerable to such surges.

While nothing can guarantee absolute protection from a direct or very close lightning strike, computers and components like NICs can, to some extent, be protected from mild power surges by using surge protectors, which are devices designed to absorb the harmful effects of surges in electrical power. When a power surge happens, the surge protector itself might get destroyed. But it’s always easier and cheaper to replace a surge protector than a NIC.

  • Failover Mechanism

Simply put, failover means having a redundant system to switch over to when the active system fails. We have it on all systems that we build. In this case, even though data redundancy and backup were available, system failover had not been implemented due to other priorities. But now, with the system having reached enough usage and importance, we’ve learnt the lesson of prioritising failover the hard way.
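
The core of the idea can be sketched in a few lines. This is a generic illustration of health-check-based failover, not the mechanism later built for JSS; the host names, the port, and the function names `tcp_check` and `pick_active` are assumptions made for the example.

```python
import socket


def tcp_check(host, port=80, timeout=2.0):
    """Treat a node as healthy if a TCP connection to it succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_active(nodes, is_healthy=tcp_check):
    """Return the first healthy node in priority order, or None."""
    for node in nodes:
        if is_healthy(node):
            return node
    return None


# Example with a stubbed health check: the primary is "down",
# so traffic should go to the standby. Host names are hypothetical.
down = {"emr-master.example"}
active = pick_active(
    ["emr-master.example", "emr-slave.example"],
    is_healthy=lambda host: host not in down,
)
print(active)
```

A real setup also has to decide who runs this check, how the switch is made visible to users (for example through a virtual IP or DNS), and how the failed node rejoins, none of which this sketch covers.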

  • Steps to replace the NIC

The steps to replace a NIC and configure the new one are not very difficult and can be found on the Internet easily. They might vary slightly, depending on the operating system being used. Even so, it’s good to know them beforehand, to avoid acting in panic when a failure happens.

  • Remote Support Options

Dealing with support issues over the phone is very difficult. Simple software like AirDroid comes in handy. Since it was cumbersome to type SMS messages on a phone, we started using AirDroid, which allowed us to send messages typed on our computer.

  • Learning while doing

Last but not least, I got first-hand context of the production system and the challenges surrounding it.

To conclude, my observation has been that the challenges and learnings in resource-constrained settings and remote environments can be quite intriguing and unique compared to developing enterprise applications. If you are a technologist wanting to solve problems in such environments, then apart from application-specific knowledge, you may find that the ability to deal with hardware and network issues, an awareness of frugal solutions like the AirDroid setup we used in this case, and the ability to communicate with users who are not very technically savvy turn out to be very necessary arrows in your quiver! It always fascinates me to see how doctors and engineers are collaborating on this project to improve health care for the underprivileged.

# Relevant Reading:

A few articles that I found worth a read after dealing with the issue:

  • 9 Things to Do When Your Internal Network Card Stops Working
  • Consistent Network Device Naming

 
