Cleaner Code, Happier ArchUnit: Extracting Nested Data Classes
Hey everyone! Let's talk about something super important for keeping our codebase sparkling clean and our development process smooth: extracting nested data classes. You know, those little bits of technical debt that sometimes creep into our projects, making our architecture rules scream a little. We're on a mission to iron out some ArchUnit temporary exclusions and make our code truly shine. This isn't just about satisfying an automated test; it's about making our code more readable, maintainable, and ultimately, a joy to work with for everyone on the team.
Why Nested Data Classes Are an Architectural Headache
So, what's the big deal with nested data classes, you might ask? Well, guys, our ArchitectureTest.kt is currently carrying a temporaryExclusions set, which is basically a little note saying, "Hey, ignore these files for now, we know they break a rule." Specifically, it's bypassing the "one top-level class per file" architectural rule. This rule is a cornerstone of clean code practices, and for good reason. Imagine walking into a new codebase (or even your own after a long weekend!), and you're trying to find a specific data structure. If it's tucked away inside another service class, perhaps deeply nested, it's like trying to find a needle in a haystack. Readability takes a hit, and understanding the system's various components becomes unnecessarily difficult. When we keep data structures separate from the logic that operates on them, we create a much clearer separation of concerns. This design philosophy dramatically improves maintainability, making it easier to pinpoint issues, refactor safely, and onboard new team members. Violating this rule often leads to a tangled mess, where a single file can balloon into an unmanageable behemoth, hiding multiple concerns under one roof. By addressing these architectural shortcomings, we're not just fixing a linter warning; we're investing in the future health and scalability of our entire application.
Furthermore, having nested data classes means that if another part of the system needs to use that data structure, it has to import the entire outer class. This can lead to tighter coupling than necessary, where changes to the enclosing service might inadvertently affect other parts of the system that merely depend on its nested data structure. Think about it: a data class is essentially a definition of information. It's like a blueprint. Do you really want your blueprint for a small component to be embedded within the blueprint for an entire factory? Probably not! You want a clear, standalone blueprint that can be easily referenced and understood. This principle extends directly to our code. Our goal is to promote loose coupling and high cohesion, two fundamental pillars of robust software design. Separating these concerns makes our system more modular, allowing for easier testing, independent development, and clearer responsibilities. It's all about making our code more predictable and less prone to unexpected side effects, ultimately fostering a more resilient and adaptable system.
Our Current Situation: The ArchUnit Exclusion List
Right now, we've got a bit of a technical debt situation on our hands. The ArchitectureTest.kt file, our architectural guardian, has a temporaryExclusions set that is currently turning a blind eye to five specific service files. These files, listed below, are flagged for containing nested data classes that violate our "one top-level class per file" rule. While these exclusions allowed us to move forward quickly at some point, they represent a lingering architectural inconsistency that we need to tackle head-on. It's like having a leaky faucet that you've put a bucket under – it works for now, but it's not the permanent, clean solution we want.
InvestmentMetricsService.ktEtfBreakdownService.ktCalculationService.ktInstrumentService.ktLightyearHoldingsService.kt
These are important services, guys, and having them on an exclusion list means we're consciously allowing a deviation from our desired architectural standards. Over time, these small deviations can accumulate, making it harder to enforce rules consistently and understand the codebase's true structure. Removing these exclusions isn't just about passing a test; it's about elevating the quality of these core services and ensuring they align with our long-term vision for a clean, maintainable, and robust system. It's about saying goodbye to temporary fixes and embracing sustainable code health.
Identifying the Culprits: Nested Data Classes to Extract
Alright, let's get down to brass tacks and identify the specific nested data classes that are causing our ArchUnit tests to grumble. We've categorized them by priority, because, well, some issues are just a bit more urgent than others! The aim here is to pull these data structures out of their cozy nests within our service files and give them their own dedicated homes. This not only satisfies ArchUnit but, more importantly, makes our code much clearer and easier to manage. Think of it as decluttering a messy room – everything gets its proper place!
High Priority: Public Classes
These are the big ones, folks. The public nested data classes are a higher priority because they are often exposed or used in crucial interfaces, caches, or asynchronous operations. When a public class is nested, it means other parts of our application (or even external systems, in some cases) might be directly interacting with this nested structure. This creates a stronger coupling between the consumer and the containing service, which is generally something we want to avoid. Extracting these makes their contracts clearer and their usage more explicit, reducing potential side effects and improving overall system clarity. It’s like ensuring all your important public signs are standalone and easy to read, not hidden within a larger business report.
Let's dive into the specifics:
-
CalculationResultinCalculationService.kt: This data class, currently residing withinCalculationService.kt, is a prime candidate for extraction. It spans lines 21-30 and, crucially, implementsSerializable. ThisSerializableinterface isn't just for show; it's a vital component that enables objects of this class to be converted into a stream of bytes, which is then often used for caching mechanisms (like Redis) or for transmitting data across networks. WhenCalculationResultis nested, it implicitly ties its lifecycle and its definition directly toCalculationService. While functional, this creates a less flexible and less discoverable data structure. ExtractingCalculationResult.ktinto its own file means that any component needing to store or retrieve calculation results from a cache can directly importCalculationResultwithout having to bring inCalculationServiceitself. This significantly reduces unnecessary dependencies and clarifies the data's independent role. Moreover, having a standaloneCalculationResultimproves code readability immensely. A developer looking to understand the output of a calculation service can immediately find its definition without scanning through service logic. This separation is a cornerstone of building highly maintainable and understandable applications, allowing us to evolve our caching strategies or calculation logic independently without fear of inadvertently breaking serialization or external consumers. This move ensures that our cache compatibility remains intact while boosting architectural clarity. -
XirrCalculationResultinCalculationService.kt: Also found withinCalculationService.kt, from lines 32-37,XirrCalculationResultis another public class that needs its own space. The note explicitly states it's returned from an async method. This means it's likely part of an asynchronous contract, potentially forming the payload of aFutureorCompletableFuture. When a class is used as a return type for such methods, its definition becomes a critical part of the API contract. Nesting it withinCalculationServiceobscures this contract, making it harder for consumers of the async method to immediately grasp the structure of the returned data. By movingXirrCalculationResultto its ownXirrCalculationResult.ktfile, we explicitly declare its role as a standalone data transfer object (DTO). This dramatically enhances the discoverability and clarity of our asynchronous APIs. Developers consuming these results can instantly locate the definition, understand its fields, and integrate it into their code with greater confidence. This refactoring doesn't just make ArchUnit happy; it makes our API contracts more robust and transparent, leading to less confusion and smoother integration points across our services. It's about providing a clear, unambiguous data blueprint for our async operations. -
InstrumentEnrichmentContextinInstrumentService.kt: This data class, located inInstrumentService.kt(lines 24-28), is described as public and encapsulating enrichment parameters. This is a classic case where a data class acts as a dedicated container for a set of related inputs or parameters for a specific operation. When such a context class is nested, it forces any client code that needs to provide these enrichment parameters to depend directly onInstrumentService. This creates an artificial coupling between the client and the service, even if the client only cares about the context object itself. ExtractingInstrumentEnrichmentContextintoInstrumentEnrichmentContext.ktbreaks this unnecessary bond. Now, clients can simply importInstrumentEnrichmentContextand construct it independently, then pass it toInstrumentServicemethods. This architectural change promotes better modularity and reusability. It means the definition of enrichment parameters is clear, independent, and easily discoverable. Furthermore, if the enrichment process itself were to evolve or be moved, theInstrumentEnrichmentContextcould follow or be adapted without draggingInstrumentServicealong for the ride. This empowers us to design more flexible and adaptable systems, where core data definitions stand on their own. -
HoldingKeyinEtfBreakdownService.kt: Residing inEtfBreakdownService.kt(lines 29-33),HoldingKeyis explicitly used as a Map key. This is a crucial detail because objects used as keys in maps (especially when dealing with custom objects) rely heavily on properequals()andhashCode()implementations to ensure correct behavior. WhenHoldingKeyis nested, its significance as a unique identifier is somewhat understated. By givingHoldingKeyits own file,HoldingKey.kt, we elevate its status to a first-class data structure, emphasizing its role in uniquely identifying holdings. This separation makes it much clearer to developers that this is a fundamental building block for data organization. It also ensures that any future changes or considerations regarding how holdings are uniquely identified can be focused solely onHoldingKey.kt, rather than being intertwined with the business logic ofEtfBreakdownService. This promotes a cleaner design where data definitions are distinct from the operations performed on that data, improving both readability and maintainability for everyone working with the ETF breakdown functionality. -
HoldingValueinEtfBreakdownService.kt: Immediately followingHoldingKeyinEtfBreakdownService.kt(lines 35-39),HoldingValueserves as a Map value. Paired withHoldingKey, this data class represents the actual data associated with each unique holding. Just like its key counterpart, nestingHoldingValuewithin the service makes it less discoverable and couples its definition unnecessarily with the service's implementation details. ExtractingHoldingValueto its ownHoldingValue.ktfile provides several benefits. Firstly, it visually clarifies thatHoldingValueis a structured piece of data that describes a holding, independent of theEtfBreakdownServicelogic. Secondly, if the structure ofHoldingValueneeds to evolve (e.g., adding new fields or modifying existing ones), these changes can be made in its dedicated file without impacting theEtfBreakdownService's core functionality or forcing a recompilation of the service if only the data structure is modified. This separation fosters independent evolution of both data and logic. It’s a huge win for maintainability and allows for a clearer understanding of the data model underpinning our ETF breakdown features, ensuring that our data definitions are as robust and flexible as the services that use them.
Medium Priority: Private Classes
While these private nested data classes aren't exposed publicly, extracting them is still a significant step towards a cleaner architecture. Even private classes contribute to a file's complexity and can make testing more challenging if they’re deeply embedded. By pulling them out, we reduce the cognitive load when reading the main service file, making it easier to grasp its core responsibilities. Think of it as tidying up your private drawers – even if no one else sees them, a well-organized space just feels better and makes it easier for you to find what you need!
Let's get into these:
-
HoldingsAccumulatorinInvestmentMetricsService.kt: This class (lines 47-64) is described as private and used in a fold operation. AHoldingsAccumulatorsounds like a helper class specifically designed to aggregate or process data during a reduction operation, often within a stream or collection. While its private nature means it's not exposed externally, having a relatively substantial class (17 lines!) nested withinInvestmentMetricsService.ktadds to the file's overall length and complexity. ExtractingHoldingsAccumulatorinto its ownHoldingsAccumulator.ktfile immediately improves the readability ofInvestmentMetricsService.kt. The service file can then focus purely on its high-level business logic, delegating the accumulation details to a dedicated, clearly named data class. This makes it easier to understand theInvestmentMetricsService's main purpose without getting bogged down in the intricacies of how data is accumulated. It also means if we ever need to test the accumulator logic in isolation or reuse it in a similar context, it's readily available as a standalone component. This refactoring contributes significantly to decoupling internal concerns and achieving a more modular and focused design within our services. -
ValidatedRowDatainLightyearHoldingsService.kt: Located inLightyearHoldingsService.kt(lines 90-95),ValidatedRowDatais a private parsing helper. Parsing often involves intermediate data structures that hold partially processed or validated information. Nesting such a helper class within the main service file, especially when parsing logic can be complex, adds noise and reduces the clarity of the service's primary functions. By extractingValidatedRowDatainto its ownValidatedRowData.ktfile under a logical package likeee.tenman.portfolio.lightyear/, we achieve several benefits. Firstly, it makes the parsing process itself more transparent and modular. Developers can easily find the definition of what constitutes "validated row data." Secondly, it cleans upLightyearHoldingsService.kt, allowing it to concentrate on orchestrating the overall holdings processing rather than defining internal data structures for parsing. This separation makes the parsing logic more testable in isolation and enhances the overall readability of the service, ensuring that our internal helpers are as well-organized as our public interfaces. -
BasicRowDatainLightyearHoldingsService.kt: Also a private parsing helper inLightyearHoldingsService.kt(lines 122-125),BasicRowDatalikely serves a similar purpose toValidatedRowDatabut perhaps at an earlier stage of the parsing pipeline. Even though it's a smaller class, its presence as a nested entity still detracts from the service's core focus. ExtractingBasicRowDataintoBasicRowData.kt(again, within theee.tenman.portfolio.lightyear/package) contributes to the same goals asValidatedRowData: improved modularity, enhanced readability, and clearer separation of concerns within the parsing logic. Having bothValidatedRowDataandBasicRowDatain their dedicated files helps to paint a clearer picture of the different stages and types of data involved in theLightyearHoldingsService's parsing responsibilities. This approach ensures that even internal helper structures are properly organized, making our codebase more robust and easier to navigate for anyone digging into the details of data ingestion and processing.
The Solution: A Path to Architectural Purity
Now that we've pinpointed the areas needing attention, let's talk about the solution. The path to architectural purity here is straightforward yet impactful: give each of these data classes its own dedicated file. This simple act dramatically improves our codebase's structure and makes ArchUnit, our architectural watchdog, much happier. By adhering to the "one top-level class per file" rule, we unlock a host of benefits that go beyond just passing tests; we build a more resilient, understandable, and scalable application. It's about empowering developers with a clear and consistent mental model of where everything lives.
New Files to Create
To achieve this, we'll be creating new files in a logical and organized manner. This ensures that our newly independent data classes are easily discoverable and fit naturally within our existing package structure. Following established conventions helps maintain clarity and consistency across the entire project.
src/main/kotlin/ee/tenman/portfolio/service/
├── CalculationResult.kt
├── XirrCalculationResult.kt
├── InstrumentEnrichmentContext.kt
├── HoldingKey.kt
├── HoldingValue.kt
└── HoldingsAccumulator.kt
src/main/kotlin/ee/tenman/portfolio/lightyear/
├── ValidatedRowData.kt
└── BasicRowData.kt
This structure reflects a clear separation, placing general service-related data classes in a common service package and Lightyear-specific parsing helpers in their designated lightyear package. This organization is key to future maintainability and helps new team members quickly grasp the project's layout.
Learning from the Best: Existing Patterns to Follow
Good news, folks! We're not reinventing the wheel here. Our codebase already has excellent examples of how properly extracted data classes should look and function. These existing patterns serve as a blueprint and confirmation that this refactoring aligns perfectly with our established best practices. By following these examples, we ensure consistency and leverage patterns that have already proven their worth. This makes the transition smooth and natural for everyone involved, reducing cognitive load and accelerating adoption.
We already have standalone data classes like:
InstrumentMetrics.ktPortfolioMetrics.ktTransactionState.ktPriceChange.ktInternalHoldingData.kt
These examples clearly demonstrate the benefits of dedicated files for data structures, making them easily discoverable, reusable, and testable. We're simply extending this proven approach to address our current technical debt, bringing more of our data definitions into the light.
What Success Looks Like: Acceptance Criteria
To ensure we've fully tackled this architectural task, we've laid out clear acceptance criteria. These points will guide our work and confirm that the refactoring is complete, robust, and doesn't introduce any regressions. Achieving these criteria means we've successfully cleaned up our code and reinforced our architectural standards.
- [ ] Extract all 9 nested data classes to separate files: This is the core task. Every identified nested class must get its own
.ktfile. - [ ] Update imports in affected service files: Once classes are moved, all references in
InvestmentMetricsService.kt,EtfBreakdownService.kt,CalculationService.kt,InstrumentService.kt, andLightyearHoldingsService.ktmust be updated to reflect the new file locations. - [ ] Remove
temporaryExclusionsset fromArchitectureTest.kt: This is the ultimate goal. TheArchitectureTestshould pass without any special waivers for these files. - [ ] All existing tests pass: No regressions! Our unit, integration, and end-to-end tests must all pass with flying colors, ensuring functionality remains unchanged.
- [ ] ArchUnit architecture tests pass without exclusions: The final validation that our architecture is clean and compliant.
Navigating the Waters: Risk Assessment
Every refactoring, no matter how beneficial, comes with potential risks. However, we've carefully assessed the situation, and the good news is that for this specific task, the risks appear to be minimal and well-understood. Our analysis suggests that this refactoring is largely a structural change with a low probability of introducing functional bugs, thanks to the nature of data classes and the clear separation of concerns we are aiming for. We’ve considered the most common pitfalls and have confidence in our approach.
- Serialization:
CalculationResulthas aserialVersionUID. This is super important because it ensures that objects serialized with an older version of the class can still be deserialized successfully with a newer one. The good news? Extraction is safe. Moving the class to its own file doesn't change its internal structure or howserialVersionUIDfunctions. We'll simply be moving the definition, preserving its serialization compatibility. This means our caches and any persistence mechanisms relying on serialization will continue to work flawlessly. - Circular Dependencies: A common headache in refactoring is creating circular dependencies, where Class A depends on Class B, and Class B depends back on Class A. This can lead to compilation issues and complex, tightly coupled systems. Our assessment: None identified. All the data classes we're extracting are pure Data Transfer Objects (DTOs). They simply hold data and don't typically have business logic that would create dependencies back to the services that use them. This pure DTO nature makes the extraction process very low risk in terms of dependency cycles.
- Cache Compatibility: We rely on caching for performance, and breaking that would be a nightmare! The fantastic news is that Redis cache keys remain unchanged. Our refactoring focuses on moving the definition of the data structures, not how they are used as keys or values in the cache. The actual data being stored and the keys used to retrieve it will be identical post-refactoring. This ensures no breaking changes to our caching layer, keeping our application speedy and efficient.
Why This Matters: The Big Picture
This refactoring isn't just about ticking boxes or silencing an ArchUnit warning, folks. It's about a much bigger picture – the long-term health, maintainability, and scalability of our entire portfolio application. Every line of code we write, every structure we define, contributes to the overall robustness (or fragility) of our system. By meticulously extracting these nested data classes, we're making a significant investment in our future.
Firstly, improved readability is a huge win. When data classes live in their own files, developers can instantly understand what data structures are available and what purpose they serve, without having to dig through service logic. This drastically reduces cognitive load, making it easier for new team members to onboard and for existing ones to navigate the codebase efficiently. Imagine a library where every book has its own distinct title and cover, rather than being a chapter hidden inside a larger, unrelated tome. That's the clarity we're aiming for.
Secondly, we're promoting better maintainability. Independent data classes mean that if the structure of CalculationResult needs to change, we modify only CalculationResult.kt. We don't risk inadvertently affecting CalculationService's logic unless it's directly related. This surgical precision in changes reduces the chance of introducing bugs and speeds up development cycles. It's about creating smaller, more manageable units that are easier to reason about and evolve.
Thirdly, this move enhances testability. When a data class is extracted, it becomes easier to create instances of it for unit testing purposes, both for the data class itself (if it has complex constructors or methods) and for the services that consume it. This leads to more focused and effective tests, which in turn leads to a more reliable application. We can ensure our data structures behave exactly as expected, in isolation, before integrating them into larger systems.
Finally, and perhaps most importantly, we are fortifying our architectural integrity. By removing temporaryExclusions from ArchitectureTest.kt, we are upholding our commitment to strong architectural principles. This consistency prevents architectural erosion, where small deviations eventually lead to a fragmented and difficult-to-manage system. It fosters a culture of discipline and attention to detail, which is invaluable for any growing project. A strong architectural foundation means our application is more resilient to change, easier to scale, and more predictable in its behavior. This refactoring is a critical step in ensuring our portfolio application remains a stellar example of well-crafted software.
Final Thoughts
So there you have it, folks! This refactoring task might seem small, but its impact on our codebase's health and our development efficiency is going to be huge. Let's get these nested data classes extracted, clean up those ArchUnit exclusions, and enjoy a more modular, readable, and maintainable portfolio application. Thanks for jumping on board and helping us make our code even better!