Tuesday, January 19, 2010

Agile ADO.Net Persistence Layer: Part 3 Service Class Single<DTO> Data Access Method

When I say data access methods, I’m talking about the methods my UI is going to call whenever it needs to get data.  When my Posts controller needs a list of BlogPosts to display, it’s going to call a BAL method like GetAllBlogPosts() or GetAllBlogPostsForCategory().  Last time I mentioned (over and over) that I like to keep things simple.  When I need to get or save data, I don’t want to have to search through 3 different classes just to find the one with the method I need.  Instead, I’m putting all my persistence logic for a given aggregate in just one place: a service class.  This is not a web service; I’m using service in the Domain Driven Design sense here. That means I have a BlogService class that is my one stop shop for all persistence that has to do with Blogs, BlogPosts, SubmittedBlogUrls, and anything else that falls within the Blog aggregate. Here is what my BlogService class looks like.  You can see that it’s mostly “Get” data access methods.

[Image: the BlogService class, showing its data access methods]

What’s an Aggregate?

I keep using the word aggregate.  If you’re not familiar with the term, it just means a group of entities that all share the same persistence class (whether that be a repository, a service, or something else).  This is a key concept in Domain Driven Design. If you want to know more, I would recommend picking up Eric Evans’ book or Jimmy Nilsson’s book on DDD.  For now, all you need to know is that a BlogPost can never exist without a Blog, so there’s no point in BlogPost having its own persistence class.  In fact, if we do break BlogPost out into its own persistence class, BlogPost’s dependency on Blog will lead to problems down the road.  What’s the solution?  We put data access methods for both Blog and BlogPost in the same persistence class and call it an aggregate.  That is why BlogService has methods for both Blog and BlogPost entities.

What type of data will data access methods return?

We covered this last post, but to recap: all data will be returned as a Data Transfer Object (DTO).  The DTOs are all defined in our Common assembly in the DataShapes folder.  Our BAL will return data in one of the following four formats.

  • a single DTO
  • a List<DTO>
  • a DataPage<DTO>
  • a string value

For more see last week’s post Agile ADO.Net Persistence Layer: Part 2 Use DTOs.
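To make those four shapes concrete, here is a minimal sketch of what the DTO and DataPage&lt;T&gt; types might look like. The property names here are my assumptions for illustration, not the actual definitions from the Common assembly:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical DTO sketch; the real BlogPost DTO lives in
// Common/DataShapes and its fields may differ.
public class BlogPost
{
    public Guid PostGuid { get; set; }
    public string Title { get; set; }
    public string Body { get; set; }
    public int? Score { get; set; }    // nullable: the LEFT JOIN may return no score
}

// A paged result wrapper: one page of DTOs plus enough
// metadata for the UI to render a pager (property names assumed).
public class DataPage<T>
{
    public List<T> Data { get; set; }
    public int PageIndex { get; set; }
    public int PageSize { get; set; }
    public int TotalRecords { get; set; }
}
```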

A simple Single<DTO> data access method

Let’s look at the simplest possible data access method.  GetBlogPost() takes a postGuid as a parameter, defines the query to find the BlogPost entity for that postGuid, and then returns the result as a single BlogPost DTO.  Here’s the complete method.

public BlogPost GetBlogPost(Guid postGuid)
{
    // Qualify PostGuid in the WHERE clause: both tables have that
    // column, so the unqualified name would be ambiguous.
    string query = @"SELECT p.*, s.Score
                    FROM [dbo].[BlogPost] p
                    LEFT JOIN [dbo].[BlogPostReputationScore] s ON s.PostGuid = p.PostGuid
                    WHERE p.PostGuid = @PostGuid";

    SqlDao dao = new SqlDao();
    SqlCommand command = dao.GetSqlCommand(query);
    command.Parameters.Add(dao.CreateParameter("@PostGuid", postGuid));
    return dao.GetSingle<BlogPost>(command);
}

The first thing you’ll notice is that this isn’t a lot of code.  All we’re really doing here is defining a parameterized T-SQL query, wrapping that query up in a SqlCommand, and then passing the command and our desired return type off to a Data Access Object (DAO) that automagically executes the command and maps the results to our desired type.  It may seem counterintuitive to write code like this when we haven’t even written the DAO yet, but that’s exactly how I did it when I wrote this code for the very first time.  I decided that my data access methods should be very simple.  I would start with the query and the DTO type that I wanted it to return, then pass them both to some type of helper class that would handle the details of running the query and figuring out how to map the query results to the properties of my DTO.  By using this top-down approach, I gave myself a very clear picture of how I needed my DAO to behave.

What’s a DAO (Data Access Object)?

By looking at the query logic above, you can see that I have this thing called a DAO, or Data Access Object.  This is a class that encapsulates the helper logic for working with my database.  The DAO handles things like creating parameters and getting a connection, and most importantly it implements methods to return my four main data formats: GetSingle<DTO>, GetList<DTO>, GetDataPage<DTO>, and GetStringValue(). The DAO and its associated DataMappers are where you’ll find the special sauce that makes this architecture work.  We’ll get into their implementation later on.
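The real implementation comes in a later post, but as a rough sketch of how a GetSingle&lt;DTO&gt; method could work, here is a minimal reflection-based version. Everything in it (the constructor, the map-columns-to-properties-by-name convention) is my assumption for illustration, not the author’s actual DAO:

```csharp
using System;
using System.Data.SqlClient;
using System.Reflection;

public class SqlDao
{
    private readonly string _connectionString;

    public SqlDao(string connectionString)
    {
        _connectionString = connectionString;
    }

    public SqlCommand GetSqlCommand(string query) => new SqlCommand(query);

    public SqlParameter CreateParameter(string name, object value) =>
        new SqlParameter(name, value ?? DBNull.Value);

    // Execute the command and copy the first row's columns onto
    // matching public properties of T, matched by column name.
    public T GetSingle<T>(SqlCommand command) where T : new()
    {
        using (var connection = new SqlConnection(_connectionString))
        {
            command.Connection = connection;
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                if (!reader.Read())
                    return default(T);   // no matching row

                var dto = new T();
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    PropertyInfo prop = typeof(T).GetProperty(reader.GetName(i));
                    if (prop != null && prop.CanWrite && !reader.IsDBNull(i))
                        prop.SetValue(dto, reader.GetValue(i), null);
                }
                return dto;
            }
        }
    }
}
```

The mapping-by-name convention is what lets the data access methods stay so short: any column the query returns that has a same-named DTO property gets filled in, and everything else is ignored.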

A BAL that embraces change

It’s easy to look at the simple code above and miss something that I think is very important.  In fact, that thing is the whole reason that I wrote this framework.  That simple data access method above is the blueprint for a flexible persistence layer that makes changing your entities and associated persistence code easy and almost painless.  It sets up a simple three-step process for all data access in your application.

  1. Define a DTO in the exact data shape that you’re looking for.  That means create a DTO property for each data field that you need out of the database.
  2. Define a query that gets the data.  It can be as simple or as complex as you like.  You can develop it in SQL Server Management Studio and optimize it there.  Use whatever process or tools work for you. When you’re done, just paste the query into your data access method.
  3. Pass both the query and your DTO to the DAO and it will automatically handle field mappings and pass the results back in the data shape you requested.
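As an illustration of the three steps, here is what adding a hypothetical new data shape might look like end to end. The DTO, the column names, and the GetList&lt;DTO&gt; call are my assumptions, following the pattern of GetBlogPost() above:

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

// Step 1: a new DTO in the exact shape the UI needs
// (hypothetical type and columns, for illustration only).
public class BlogPostHeadline
{
    public Guid PostGuid { get; set; }
    public string Title { get; set; }
    public DateTime PublishedDate { get; set; }
}

public partial class BlogService
{
    // Steps 2 and 3: paste in the query, hand it to the DAO.
    public List<BlogPostHeadline> GetRecentHeadlines(int count)
    {
        string query = @"SELECT TOP (@Count) PostGuid, Title, PublishedDate
                        FROM [dbo].[BlogPost]
                        ORDER BY PublishedDate DESC";

        SqlDao dao = new SqlDao();
        SqlCommand command = dao.GetSqlCommand(query);
        command.Parameters.Add(dao.CreateParameter("@Count", count));
        return dao.GetList<BlogPostHeadline>(command);
    }
}
```

Notice that no mapping code had to be written: the DAO matches the three columns to the three DTO properties by name, so adding a field later means touching only the DTO and the query.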

This is a very powerful way to work.  I can’t count the number of times that I’ve worked with an architecture where I dreaded any changes, because I knew that any data fields added would require me to modify a sproc, a DAL method, a BAL method, parsing logic, and an entity class.  It all adds up to a lot of friction that resists any change.  This BAL design embraces change.  It’s written with the attitude that we know change is going to happen, so we’re going to give you as few things as possible to modify, and make sure we don’t have any cross-cutting dependencies, so that you can make changes easily.

Next time, more on the service classes.

Next Post: Agile ADO.Net Persistence Layer Part 4: Writing data access for new data shapes

11 comments:

  1. I know that you are trying to keep your post simple but the BlogService has way too many methods on it. I'd suggest breaking up the service (aka. using SOLID principles) into smaller business related groups.

  2. This comment has been removed by the author.

  3. I'm a big fan of Robert Martin and SOLID, but I think it's been taken too far. For the majority of projects I think it is much better to group your persistence logic and business logic together in a single service class. My opinion is that the single responsibility principle makes a lot of sense when applied at the method level, but should not always be used at the class level. I think applying SOLID at the class level for something like a service class is an example of us just following a practice because we think it's the architecturally right way to do it, but in reality it just makes our code harder to maintain. Some classes exist to be the one stop shop for multiple distinct methods that provide a certain category of functionality. I don't think SOLID should be applied to them.

  4. This really is one messed up implementation and I hope that any newbies do not fall into the trap of thinking this is good advice.

    What I see here is lots of code with little benefit. There is no type safety for queries and the design does not facilitate support for query composition. This makes it error-prone, hard to maintain and bothersome to write in the first place.

    For a proper architecture please investigate the Repository/Specification patterns. For examples of how to apply these patterns in combination with an ORM and LINQ, go look at the LightSpeed ORM tutorials. It's a much more sound architecture and you'll be sooooo much more productive.

  5. Morten, thank you for your concern but I don't think you have to worry. I don't get many newbies reading this blog. I do get a lot of very senior developers though.

    You raise one good point about the lack of type safety in the queries. Obviously the DTOs that come out of the data access methods are type safe, so I assume your concerns are about the fact that our query logic is a simple string, not a query object or queryable entities like LINQ. I don't really see this as a problem, for one reason: you should always have covering tests for your data access methods. A simple CRUD test that calls your data access methods will quickly expose any errors in your queries. Also, your process should be to develop the query in SSMS and just copy and paste it into your data access method. That cuts back on a lot of the opportunity for error also.

    I would also say that the covering tests are necessary even if you're using LINQ. LINQ goes a long way toward making sure you get field names correct, but as anyone who has done any real development with LINQ knows, the deferred query execution means that it is still all too easy to create a query with errors that won't show up until run time.

  6. I'd tend to disagree with this advice also. You shouldn't write tests to ensure that your code generator works - instead, those should be part of the code base for the code generator itself. As such, you should not write tests to assert that the LINQ provider generates valid SQL. However, you obviously should write tests for any non-obvious queries (possibly even all non-CRUD queries), but even so will have many fewer tests than if you add them for the entire data layer. Yet another man-month saved on larger projects.

    Another critical point is that these tests should only be run on your build server and not by individual developers. Unit tests need to execute fast and without dependencies, which is certainly not the case for anything that needs a matching database. I therefore think it unwise to choose an approach where the compiler is unable to help you validate your work. You seem to like SSMS but I find myself much more productive without having to switch between tools.

    Additionally, when using string literals you lose the ability to use query composition. You cannot take an existing query and append additional restrictions, sorting or similar to it (unless you think StringBuilder is a great solution for this) and therefore need many, many more methods if you want the same flexibility (for instance, allowing users to configure custom filters and sort options).

    I therefore maintain that it is poor advice you are handing out here. The time required to write up and maintain all of this boiler-plate code and SQL would be much better invested in additional unit tests, improved architecture, product features, or at the very least something that a tool isn't able to automate for you.

    If you insist on doing development the hard, old-fashioned way the least you can do is to use a tool like MyGeneration to auto-generate the data-layer.

  7. Hi Morten, thanks for being a little more civil. You don't have to agree with this approach. A lot of my friends in the Denver/Boulder .Net community don't agree with it. I do think it's funny that you keep saying that it's inefficient, because the whole reason that I'm sharing it is that it's an extremely efficient way to work and it has drastically reduced the amount of time I spend fiddling with query logic and custom data shapes.

    Re testing data access methods, it's just a really good idea. It's very possible for your LINQ query to compile, but then throw an error at run time. The problem isn't with LINQ, it's with your query. Most of the errors I see come from people trying to do some data shaping with a method call that compiles but isn't valid inside the LINQ query. It takes no time to write a quick automated integration test that goes through the CRUD methods. See http://rlacovara.blogspot.com/2009/12/how-do-you-test-crud-methods-just-hit.html

  8. Apologies for my first post, I can see that it was somewhat crude.

    I'm still totally unconvinced of how you can be "extremely efficient" when you need to hand-code many of the things that you'll get for free when using an ORM.

    Just look at the code in the link you posted and you'll see lots of code repetition (setting parameters, executing queries, etc.) in addition to all of the hand-written SQL.

    You might be the fastest snail on the block, but that's still not very fast ;)

    Back when I gave up this approach in favor of using an ORM I found that on average I saved about 30% of the total development time, which is just a huge amount of time saved on every project. And time saved is a penny earned.

  9. Hi Morten, No problem and thank you for the comments. I want to hear from people who don't agree, especially if they have something new to bring to the table.

  10. Personally, I like the straight-forwardness of the design. It embodies elements of conclusions I've come to regarding various tried approaches. Sure there are ORMs and LINQ, but I don't really like these "automagic" tools. But I don't even like MS-generated typed datasets, either. I guess I'm old-fashioned. Surely, these tools add another layer between direct ADO and your business code, and there's got to be overhead there. When something goes wrong, I think having more layers and frameworks/tools in the way makes things more complicated in the long run. Once this is written and debugged, you're done, right? (At least until the change requests start coming). So what's the big deal?

  11. Hi Rudy,

    I'm using this approach (modified) for 2 new projects, I have stored procs which create my persister and DAO classes from the database, so once I've created my database tables, the vast majority of the "work" Morten talks about is actually done for me.

    I find that there are a few issues with this approach: more complex objects are harder to load, as they require additional work. However, having been a developer and architect for a great many years, working on systems using ORMs etc., I have found that they create code which almost always requires tweaking, which can be forgotten when you make changes months down the line, which causes faults and issues, meaning that defects arise due to the automated code creation. From experience, I much prefer having full control over code, rather than having code which regenerates and creates issues.

    I find your views and ideas very refreshing!

    Davoc
