External vs. Internal Solutions: Reliability

Pydantic Top 20 devs - code ownership

External solutions often undergo rigorous testing and widespread use, making them more reliable. Internally developed solutions may lack the same level of scrutiny and real-world validation. Reliability is a cornerstone of effective software development. When considering the adoption of external solutions versus developing internal ones, the reliability factor often tips the scale in favor of established external tools.

Reliability: The Case for External Solutions

The reliability of external solutions often surpasses that of internally developed alternatives due to extensive production testing, ongoing iteration and improvement, and robust community support. While pride and the desire for custom solutions can drive NIH syndrome, the practical benefits of adopting proven external tools—especially their reliability—cannot be overlooked. By leveraging these external solutions, organizations can avoid the pitfalls of technical debt and inefficiency, focusing instead on innovation and value addition.

Extensive Production Testing

External solutions have typically been in use for a longer period, undergoing extensive testing in a variety of production environments.

Stress-Tested

External solutions are often used by a wide user base, ranging from small startups to large enterprises. This diverse usage ensures that the software is stress-tested under different scenarios and workloads, revealing potential weaknesses that might not surface in a limited internal testing environment.

Validated by Experience

The experiences of thousands or even millions of users help in identifying and resolving edge cases and obscure bugs. This level of real-world validation is hard to replicate with an internally developed solution, which might only be used by a limited number of users within the company.

User Feedback Loop

With a large user base, external solutions benefit from continuous feedback. Users report bugs, suggest features, and share performance insights, creating a feedback loop that drives continuous improvement. This iterative process makes the software more reliable over time, which directly benefits everyone who depends on it.

Iteration and Improvement

Established external solutions benefit from multiple iterations and a continuous improvement process. Companies love Agile methodologies, sprints, and other rituals. Yet a closer look at external solutions shows that they have often already gone through hundreds or even thousands of iterations, which makes it hard for internal projects to compete.

Regular Updates

Developers of popular external tools regularly release updates that include bug fixes, performance enhancements, and new features. These updates are often driven by user feedback and emerging industry standards, ensuring the software remains robust and relevant.

For example, the data validation library “Pydantic” releases updates every few weeks to every month (pydantic/releases).
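A release cadence like this can be estimated from a list of release dates. The sketch below is one rough way to do that; the function names are illustrative, and the sample dates are placeholders, not actual Pydantic release dates.

```python
from datetime import date
from statistics import median


def release_gaps(release_dates: list[date]) -> list[int]:
    """Return the number of days between consecutive releases."""
    ordered = sorted(release_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def median_gap_days(release_dates: list[date]) -> float:
    """Median number of days between releases; needs at least two dates."""
    gaps = release_gaps(release_dates)
    if not gaps:
        raise ValueError("need at least two release dates")
    return median(gaps)


# Placeholder dates for illustration only, NOT real Pydantic releases.
sample = [date(2024, 1, 10), date(2024, 2, 1), date(2024, 2, 20), date(2024, 3, 25)]
print(release_gaps(sample))
print(median_gap_days(sample))
```

The median is used rather than the mean so that one unusually long pause between releases does not dominate the estimate.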

Performance and Memory Optimizations

Through ongoing optimization efforts, external solutions become more efficient. Performance bottlenecks are identified and addressed, and memory usage is optimized, resulting in a more stable and efficient product. Achieving similar levels of optimization with an internal solution would require significant time and resources.

Community and Ecosystem Support

The broader community around external solutions contributes significantly to their reliability.

Community Contributions

Open-source and popular proprietary solutions often have active communities that contribute code, identify issues, and provide patches. This collective effort leads to a more resilient and feature-rich product.

To visualize the evolution of a codebase over time and the contributions made by the community, tools like Hercules can be very valuable. Hercules analyzes the repository history and can generate detailed graphs showing the survival of code lines. These visualizations help developers and project managers understand how community contributions have influenced the codebase over time.

SQLAlchemy line burndown. Generated with `hercules --burndown --first-parent --pb https://github.com/sqlalchemy/sqlalchemy | labours -f pb --resample year -m burndown-project`
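To make the idea of "line survival" concrete, here is a minimal conceptual sketch of what a burndown chart counts: for each year, how many lines written in each earlier year are still present. This is not how Hercules works internally; the `burndown` function and the toy history are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Optional


def burndown(
    lines: list[tuple[int, Optional[int]]], years: list[int]
) -> dict[int, dict[int, int]]:
    """For each year, count surviving lines grouped by the year they were written.

    Each entry in `lines` is (year_added, year_removed); year_removed is None
    when the line is still present in the codebase.
    """
    result: dict[int, dict[int, int]] = {}
    for year in years:
        cohorts: dict[int, int] = defaultdict(int)
        for added, removed in lines:
            # A line counts for `year` if it already exists and is not yet removed.
            if added <= year and (removed is None or removed > year):
                cohorts[added] += 1
        result[year] = dict(cohorts)
    return result


# Hypothetical toy history: five lines, two of them deleted in 2022.
history = [(2020, None), (2020, 2022), (2021, None), (2021, 2022), (2022, None)]
table = burndown(history, [2020, 2021, 2022])
for year, cohorts in table.items():
    print(year, cohorts)
```

Each row corresponds to one vertical slice of a burndown chart: the bands are the cohorts, and a band shrinks as its lines are rewritten or deleted.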

Comprehensive Documentation

External solutions usually come with extensive documentation, created and refined over time. This documentation helps users understand the intricacies of the tool, troubleshoot issues, and implement best practices, enhancing the overall reliability of the solution.

Internal Development Challenges

In contrast, internally developed solutions face several challenges that can compromise reliability.

Limited Testing

Internal solutions are typically tested by a smaller group of users in a controlled environment. This limited scope can fail to uncover bugs and performance issues that would become apparent under wider usage.

Resource Constraints

Developing a reliable, high-performance solution internally requires significant resources, including time, skilled personnel, and financial investment. Often, these resources are constrained, leading to compromises in the quality and reliability of the software.

Lack of Continuous Improvement

Unlike external solutions, which are continuously improved based on widespread feedback, internal solutions might not receive the same level of iterative enhancement. This can result in stagnation and technical debt over time.

Nuances

While the case for external solutions hinges on their reliability, it’s important to recognize that not all external libraries, packages, or frameworks are equal. The decision to adopt external tools must consider the specific nuances and individual characteristics of each option.

Extensive Production Testing

Not every external solution undergoes the same level of production testing. Some might be relatively new or niche, lacking the extensive stress-testing seen with more established options. It’s essential to evaluate the maturity and user base of an external tool to gauge its reliability accurately. Due diligence involves assessing how well the solution has been tested in environments similar to your own.

Iteration and Improvement

The frequency and quality of updates can vary significantly between external solutions. While some tools, like the data validation library “Pydantic,” release updates frequently to address bugs and enhance performance, others may have slower development cycles. Understanding the development lifecycle and commitment to improvement from the maintainers of an external tool is crucial in determining its long-term reliability.

Community and Ecosystem Support

The strength of community support can differ widely. Highly popular tools may have robust community contributions and extensive documentation, while less-known solutions might lack these benefits. When evaluating external solutions, consider the size and activity level of the community, as well as the quality and availability of documentation. This will help ensure that the tool you choose is supported by a vibrant ecosystem that can assist with troubleshooting and development.

Also, not everyone works the same way. When we examine the code ownership of NumPy, we observe something that might seem very strange: Eric Wieser became the owner of approximately 40% of the codebase in just one month, which is roughly 250,000 lines of code.

How did he achieve this? By updating and reorganizing LAPACK in the NumPy codebase with these three pull requests:

Due to these updates, the graph of NumPy code ownership is not entirely accurate, but it still provides valuable insights, such as the proportion of NumPy code that is “generated” (at least ~40%). Knowing this, trust in that code should be attributed not to Eric Wieser but to f2c, a program that translates Fortran into C.
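Ownership figures of this kind ultimately come down to counting which author last touched each line. As a rough sketch, the share per author can be computed from the output of `git blame --line-porcelain`, which repeats an `author <name>` header for every line of the file. The function name and the abbreviated sample below are hypothetical; real porcelain output carries more header fields per line.

```python
from collections import Counter


def ownership_share(blame_output: str) -> dict[str, float]:
    """Fraction of lines last touched by each author in --line-porcelain output."""
    prefix = "author "
    counts = Counter(
        line[len(prefix):]
        for line in blame_output.split("\n")
        # File content lines start with a tab, so they never match this prefix;
        # "author-mail" and "author-time" headers do not match either.
        if line.startswith(prefix)
    )
    total = sum(counts.values())
    return {author: n / total for author, n in counts.items()} if total else {}


# Abbreviated, made-up blame output with hypothetical authors.
sample_blame = "\n".join([
    "aaaa111 1 1 3",
    "author Alice",
    "author-mail <alice@example.com>",
    "\timport os",
    "aaaa111 2 2",
    "author Alice",
    "\timport sys",
    "aaaa111 3 3",
    "author Alice",
    "\tauthor = 'not a header'",
    "bbbb222 1 4 1",
    "author Bob",
    "\tprint(os.getcwd())",
])
print(ownership_share(sample_blame))
```

A count like this says nothing about whether the lines were written by hand or produced by a tool such as f2c, which is why the ownership graph needs to be read with the project's history in mind.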

Internal Development Challenges

Although internal development often faces limited testing, resource constraints, and a slower pace of continuous improvement, there can be exceptions. For instance, a company with dedicated resources and expertise might successfully develop a reliable internal solution tailored precisely to its needs. However, this is usually the exception rather than the rule, and internal projects must be critically assessed for their potential to overcome these common challenges.
