A new personal project
After coming to the disappointing conclusion that I cannot, at this time, pursue building my NAS device from scratch, I've turned toward another long contemplated and frequently revisited project. For quite some time I've dabbled with various Java based functionality, such as db4o, Scala and RCP frameworks, and have been toying with some ideas to implement a useful application from scratch.
I know it is pretty sad that a professional software developer of 7 years is psyched about creating an application from scratch, but most of my development work involves configuring and/or extending existing applications purchased from third party vendors, identifying and fixing bugs in code others have written, addressing performance issues, and interfacing multiple enterprise applications. I have done very little UI development, and most of that has been for web applications.
Standing at the start of my desktop application exploration, I realize my first goal is to implement something simple and useful with an expandable design. The tool needs to be useful to keep my interest, otherwise the application will be yet another academic exercise. The simple but useful feature provides the bait that leads me along, but by having an extensible design, adding additional features becomes incremental growth instead of titanic rewrites and redesigns. So, I have the following thoughts:
- The application should run on the JRE and use the Scala programming language for at least part of the functionality. I am learning to love functional programming, but haven't applied it to any practical application yet.
- If using a database, I would like to use db4o instead of a relational database. A lot of the stuff that slows down development with a database goes away when the Java class IS the database schema. No dual implementation (Java class and relational schema) or mappings are required. Of course, I may be better off persisting my data as flat files.
- I should standardize on a single IDE (Eclipse or NetBeans), as this is primarily a single developer effort, and trying to develop in two different IDEs would require a lot of overhead. Unfortunately, this adds limitations to the technology I can use for development. For example, Scala only has a plug-in supporting Eclipse, I am more familiar with Eclipse's features, I like the idea of learning more about OSGi, and I like the responsiveness of SWT vs. Swing. On the flip side, only NetBeans has Matisse (for free anyway), the profiler seems nicer, RCP appears to be easier, I'm more familiar with the Swing API, I like the idea of a single distribution for all machines (SWT requires different DLLs), and I LOVE the Mercurial plug-in.
- I don't expect this code will be opened for general availability, but I should take legal considerations (GPL, LGPL, EPL, BSD, etc.) into account when choosing libraries. Otherwise, the conflicting licenses could become prohibitively difficult to work around.
- I would like to leverage a multi-document interface (MDI), at least optionally, with tabs that can be reordered, maximized to full screen, closed with middle button click, and possibly minimized to border buttons.
- Ideally, data could be replicated between machines with ease, as I would like to use the application on my home and work machines. This, of course, requires multi-platform functionality, as my work laptop runs Windows XP while my home machine runs Ubuntu Linux. (Gosh how I wish I could run Ubuntu at work too...)
Clearly, I have more to think about before choosing an IDE. As for the application's features, I've considered the following:
- A secure password store
- A wiki style notebook similar to Tomboy Notes for Linux. I found that taking notes in ASCII with a markup "language" like Markdown worked quite well. The file's name contains the timestamp of when the file was created, and all context of what the note was about is contained within. When the file is stored in the file system, the operating system's indexing tool can be leveraged for searches, and the application can manage categorization in a flexible way. I'd like to include IDE style tab completion for internal links, syntax highlighting, a daily todo list feature, hyperlinks outside the application (e.g. web pages, Lotus Notes documents, network share locations, etc.) and more.
- A batch image resize tool for high quality images. I've actually already implemented this using the ImageJ library available in the public domain, but the GUI is a bit hackish and not robust.
- A contact management application that leverages low level relationships. For example, a family of five may all have the same contact information while the kids are young, but as each kid gets his/her own email address, mailing address (e.g. goes to college), cell phone, etc., each individual's contact information can be updated accordingly. Or, if the entire family moves to a new location, a single change to their mailing address node would update all family members at once. This would be a perfect application for db4o.
- An invoicing and general business tracking application for my wife to use for her photography business. This is a nontrivial piece of functionality with the potential to grow significantly. It could be used to drive marketing, schedule photo shoots, track income and expenses and so on.
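The shared "mailing address node" idea from the contact management item can be sketched in a few lines. This is only an illustration in Python, not a db4o design: the `Address` and `Person` classes and all of the names and addresses are hypothetical. The point is that family members hold a reference to one address object rather than each keeping a copy.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """A mailing address node that several people may share."""
    street: str
    city: str


@dataclass
class Person:
    name: str
    address: Address  # a reference to a shared node, not a copy


# Hypothetical family of five: every member points at the same Address object.
home = Address("12 Elm Street", "Springfield")
family = [Person(name, home) for name in ("Mom", "Dad", "Ann", "Ben", "Cal")]

# The whole family moves: one change to the shared node updates everyone.
home.street = "98 Oak Avenue"
home.city = "Shelbyville"

# Ann goes to college: she gets her own node and nobody else is affected.
family[2].address = Address("Dorm 4, Room 210", "Capital City")

for person in family:
    print(f"{person.name}: {person.address.street}, {person.address.city}")
```

An object database is a natural fit for this shape because the references between objects are stored as they are, with no join table standing in for "lives at".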
I really need to have an IDE that makes the GUI layout and interaction with the business logic clear and easy to do in order to implement all that I would like. I suspect this seems daunting mostly because it is new to me. I'm still looking for some best practices on how to lay out the MVC design of the GUI, though I suspect I'll come to a solid conclusion only after I've tried it.
NAS put on hold
As it turns out, life really is what happens when you are making other plans. Since the last time I posted, I've had a few things come along requiring big money... bummer. I guess I'll have to wait a little longer before I build this machine. I suppose this isn't all bad, as chances are the cost of these components will go down over time, assuming the economy doesn't throw a wrench into that theory.
I did get close to a finalized configuration of components with a slight change to the machine's application. Originally, I wanted to make the Network Attached Storage device double as the firewall and gateway for my home network to the internet. For security purposes, combining the firewall and storage isn't recommended, as a security failure in the firewall makes for easy access to all your data. Besides that, I already have a wireless Netgear router that can also serve as a gateway for a local wired network.
I currently have a 31" LCD television, a HDHomeRun (network enabled digital TV receiver), and no way to connect the two. A simple and lightweight media PC would nicely bridge this gap, and many of my NAS requirements would also apply here. With a media PC, I would finally have the convenience of TiVo style TV watching, and I could create backup copies of the kids' videos that are at high risk of death by scratching. Of course, the sheer volume of data is a concern when dealing with multimedia, particularly video. To get the most out of my hard drive space and back-up media (CDs and DVDs), the machine would need to support efficient MPEG4 decoding at the very minimum. I have a Core2 duo workstation that could offload the conversion processing from MPEG2 to MPEG4 (or better), so I'm not too concerned with the media PC's processor power as long as it supports hardware decoding of MPEG4. Given these thoughts, I've put together the following list of components:
- JetWay J7F5M1G2E-VHE-LF CX700M VIA CX700M Mini ITX Motherboard/CPU Combo - This motherboard has integrated video capable of MPEG-2, MPEG-4, and WMV9 hardware decoding, high definition sound, gigabit Ethernet, HDMI output (not sure if this is video only or includes sound), and a 1GHz C7 processor that consumes a minuscule 9 watts of electricity. The board only supports 2 SATA 3.0 interfaces, so I would need to use the only PCI expansion slot to add support for disks 3 and 4. This board is fanless and therefore silent.
- picoPSU-120 Power Kit - A fanless and very space efficient power supply capable of providing 120 watts of power at 12 volts. This is probably overkill for this machine, so I may consider scaling back to a 90 watt power kit, but I would have to run the numbers a bit more.
- 1GB 240-Pin DDR2 SDRAM 533 (PC2 4200) Desktop Memory - I would order the cheapest respectable brand name memory on Newegg at the time I order. At present, this is the Kingston ValueRAM. The maximum memory supported by the motherboard is 1GB in a single memory module.
- Scythe S-FLEX SFF21D 120mm Case Fan - A single, very quiet 120mm fan that could completely replace the air in my custom case roughly one time per second, which should be sufficient for the passively cooled components within.
- Pioneer DVR-115DBK DVD Burner - This is an inexpensive and popular IDE drive. The motherboard supports up to 2 IDE devices, and I want to ensure the SATA connections are used for hard drives only.
I already have the hard drives and PCI to SATA 1.5 expansion card. These will need to be moved from their existing machines into the new one once it is built.
- Western Digital Caviar GP WD5000AACS 500GB 5400 to 7200 RPM 16MB Cache SATA 3.0Gb/s Hard Drive - One of Western Digital's new "green" line of hard drives. I have two of these configured in a RAID1 array.
- Rosewill RC-201 PCI SATA x2 Silicon Image, RAID 0/1/JBOD, Normal and Low Profile Host Controller Card - This is currently not used, as I have decommissioned the older machine that only had IDE ports on-board. It is a fairly inexpensive card that works nicely with Linux. I have no intention to use the card's RAID feature, as Linux has powerful software based RAID that I have fallen in love with :) I bought it for its support for 2 SATA 1.5 drives.
I haven't yet decided how I want to boot the machine. Ideally, the hard drives would exist solely for storage. No applications would be installed on them at all. There is one more IDE device supported (in addition to the optical drive), so I could install another hard drive. However, I would like to avoid using another hard drive for space and power reduction. I was thinking something like a Compact Flash card installed as a non-removable IDE device (requires adapter card) or a USB flash drive wired directly to the motherboard and rigged inside the case. I suspect the Compact Flash card would be more performant, but it would also cost a little more.
The parts listed here aren't super expensive. Unfortunately, these aren't the only costs involved. To build the machine I really want, I need to build my own case from scratch. I not only need to buy the materials (not sure what I want to use yet), but I also have to purchase some tools, like a Dremel. I have some rough ideas on how I would like to set up the inside of the case. Maybe I'll put together some pictures in the future, but until then, here is a brief description:
- Four 3.5" hard drive bays will be located in the bottom front of the case. Instead of configuring them horizontally, I want to take advantage of the natural airflow advantage of a vertical configuration. Proper venting is needed (e.g. modder's mesh) to allow airflow to pass directly over the hard drives. Maybe the bottom of the case could be vented for maximum intake.
- Just above the hard drives is where the optical drive will be installed in the traditional horizontal position. This makes all the drives fit into a single brick-like unit in a dense but fairly well vented layout.
- The motherboard will then be configured (looking at the front of the case) to the right of the drives. To do this, the drives, and therefore the optical drive's tray, will be off-center to the left. Depending on the PCI expansion slot's location, the case may need to be longer (move the motherboard further toward the back), taller (place card below drives), or a riser could be used to move the PCI card out of the way. Regardless, the motherboard needs to be flush with the back of the case to give access to the on-board ports, so this will require some designing finesse. Again, special care will be needed to ensure the passive cooling on the motherboard has sufficient air flow. This may require additional venting on the front of the case in front of the motherboard.
- The 120mm fan will be located on the back of the case near the top, where a traditional ATX case would place the power supply. This is one benefit of using a very low profile power supply inside the case. Of course, this does require the power converter to be outside the machine, similar to a laptop. I did want to include a laptop battery inside the machine for backup power, but this was prohibitively expensive.
With this layout, I'd like to get an overall size of 20cm wide x 20cm tall x 25cm deep, or roughly 8in wide x 8in tall x 10in deep.
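As a rough check on the size target and on the earlier claim that the fan could replace the air in the case about once per second, the arithmetic can be sketched as below. It assumes an empty box (ignoring the volume taken up by the drives and the motherboard), and the fan's rated airflow is not given in this post, so the result is only the figure to compare against the fan's spec sheet.

```python
CM_PER_INCH = 2.54
LITRES_PER_CUBIC_FOOT = 28.3168

# Target case dimensions from the post, in centimetres.
width_cm, height_cm, depth_cm = 20, 20, 25


def cm_to_in(cm: float) -> float:
    """Convert centimetres to inches."""
    return cm / CM_PER_INCH


# 1 litre = 1000 cubic centimetres.
volume_litres = width_cm * height_cm * depth_cm / 1000

# Airflow needed for one full air change per second, in cubic feet per minute.
required_cfm = volume_litres / LITRES_PER_CUBIC_FOOT * 60

print(f"{cm_to_in(width_cm):.2f} x {cm_to_in(height_cm):.2f} x {cm_to_in(depth_cm):.2f} in")
print(f"volume: {volume_litres:.1f} L, one change per second needs {required_cfm:.1f} CFM")
```

So the case holds about 10 litres, and a fan would need to move a little over 21 CFM to turn that volume over once per second.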
I hope to sometime return to this project, as I suspect this would be a good little machine that is used a lot. Until then, maybe someone else out there could use some of my thoughts to build something similar. If you do or already have, please share your experience. I would love to hear about your successes and growing pains.
NAS - hard drive and transfer speeds
Okay, so I've got the general idea of what I want, now it is time to hash out some of the specs. Where to start?
The working parts of the device should be modern or at least based on modern standards. Since I will be going to all the trouble of building this device from scratch, I don't want to be put into a position where a significant redesign is needed in the event of a hardware failure of some kind.
- The primary hard drive interface should be SATA. The old PATA (a.k.a. IDE) hard drive standard isn't gone yet, but it is certainly fading away into the sunset, not to mention its technical inferiority. I won't rule out PATA completely for one or maybe two drives if the motherboard provides it. I'll need to be more careful when integrating PATA drives into the RAID configuration to prevent their technical limitations from killing the performance of the entire device.
- The motherboard would ideally use the ATX power connector standard. This dramatically increases the number of power supply options, as ATX is the most common standard available.
- The motherboard needs to support multiple SATA hard drives. The preferred number is 6 on-board connections, but if this is not possible in a small, power efficient machine, it should at least be upgradeable via an expansion card or two.
Generally speaking, a NAS device doesn't require much processing power. The largest bottleneck in performance will most likely be the network itself. Most home networks will use wired fast Ethernet (100 Megabits/sec), wired gigabit Ethernet (1000 Megabits/sec), wireless g (54 Megabits/sec), or wireless n (248 Megabits/second). Note, however, that these measurements use the bit as the base unit of measure, not the byte (8 bits) that most of us are comfortable with.
Fortunately, Wikipedia has an excellent collection of device bandwidths available at http://en.wikipedia.org/wiki/List_of_device_bandwidths that shows data transfer rates in both bits and bytes. It also appears to be a fairly complete list of device interfaces that may be used in this system.
One important point to keep in mind is that the perceived performance of a NAS device will be affected by the slowest point in the data transfer chain. This includes the network adapter on the machine using the storage device and everything between the two machines. I've configured my home network to support gigabit transfer rates, so my NAS device should take full advantage of this. This means I should be able to get a theoretical top transfer rate of 1000 Mb (megabits) per second, which is equal to 125 MB (megabytes) per second.
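The bit-to-byte conversion and the "slowest point in the chain" rule can be sketched as follows. The link rates are the theoretical figures quoted in this post; the two example chains, the hop names, and the helper names `mbit_to_mbyte` and `slowest_hop` are hypothetical and only there for illustration.

```python
BITS_PER_BYTE = 8


def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return mbit_per_s / BITS_PER_BYTE


# Theoretical link rates in Mbit/s, as quoted in the post.
LINK_RATES_MBIT = {
    "fast Ethernet": 100,
    "gigabit Ethernet": 1000,
    "wireless g": 54,
    "wireless n": 248,
}


def slowest_hop(chain):
    """Return (hop name, MB/s) for the slowest hop in a list of (hop, link type)."""
    rates = [(hop, mbit_to_mbyte(LINK_RATES_MBIT[link])) for hop, link in chain]
    return min(rates, key=lambda pair: pair[1])


# Hypothetical transfer chains between a client machine and the NAS.
wired_chain = [
    ("desktop adapter", "gigabit Ethernet"),
    ("switch", "gigabit Ethernet"),
    ("NAS adapter", "gigabit Ethernet"),
]
laptop_chain = [
    ("laptop adapter", "wireless g"),
    ("router", "gigabit Ethernet"),
    ("NAS adapter", "gigabit Ethernet"),
]

print(slowest_hop(wired_chain))
print(slowest_hop(laptop_chain))
```

In the all-gigabit chain the ceiling is 125 MB/s, while a single wireless g hop drags the whole transfer down to under 7 MB/s no matter how fast the NAS itself is.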
Here is a breakdown of possible hard drive transfer rates for this device. SCSI is a bit out of my price range and overkill for a home solution.
- Ultra DMA ATA 66 - 528 Mbit/s = 66 MB/s
- Ultra DMA ATA 100 - 800 Mbit/s = 100 MB/s
- Ultra DMA ATA 133 - 1064 Mbit/s = 133 MB/s
- SATA 150 hard drive - 1500 Mbit/s on the wire = 150 MB/s of data (SATA's 8b/10b encoding spends 10 bits on each byte)
- SATA 300 hard drive - 3000 Mbit/s on the wire = 300 MB/s of data
The last three interfaces are all faster than the maximum transfer rate possible over a gigabit network and should be suitable for such an application. As stated before, other technical merits (hot pluggable, no restrictions on writing to multiple devices at once, newer standard with stronger future) make SATA the ideal option. In the current market, you would be hard pressed to find large SATA hard drives that don't conform to the 300 standard, and the price differences between 150 or 300 aren't significant. You are more likely to find disk controllers on the motherboard or expansion cards (e.g. PCI cards) that use the slower SATA 150 interface. Fortunately, virtually all SATA 300 hard drives are backward compatible with the SATA 150 controllers, so compatibility is virtually a non-issue.
Speaking of expansion cards, it is quite likely one will be needed in my device. This adds a new interface to take into account when assessing bandwidth bottlenecks. According to the Wikipedia page, a 32 bit PCI expansion slot running at 66 MHz has a theoretical maximum transfer rate of 2133 Mbit/s, or 266.7 MB/s. (The plain 33 MHz version of the slot, which is what most desktop boards provide, tops out at half that, roughly 133 MB/s, so it is worth checking which one the motherboard has.) This means that one should not expect the best performance from a single SATA 300 or multiple SATA 150 drives connected to such an expansion card. Still, for the needs of a NAS device, a PCI expansion card supporting two SATA 150 drives should work nicely for expanding a RAID configuration. Borrowing a trick from PATA style configurations, pairing a drive on a motherboard controller with a drive on the PCI controller in each RAID1 mirror might optimize reads and writes when dealing with large files.
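To see how the slot's bandwidth divides between the drives on an expansion card, here is a small sketch. It makes two simplifying assumptions that are not from the post: the drives split the bus evenly when they are all busy, and nothing else (such as a network adapter) is sharing the same PCI bus. The 133.3 MB/s figure for a 33 MHz slot is also not from the post; it is the standard rate for 32 bit PCI at that clock.

```python
GIGABIT_MB_S = 125.0  # gigabit Ethernet ceiling in MB/s

# Theoretical PCI bus rates in MB/s.
PCI_BUS_MB_S = {
    "32 bit / 66 MHz": 266.7,  # figure quoted in the post
    "32 bit / 33 MHz": 133.3,  # assumption: standard rate, not from the post
}


def per_drive_share(bus_mb_s: float, drives: int) -> float:
    """Best-case MB/s per drive when every drive on the card is transferring."""
    return bus_mb_s / drives


for bus, rate in PCI_BUS_MB_S.items():
    share = per_drive_share(rate, drives=2)
    verdict = "network" if share >= GIGABIT_MB_S else "PCI bus"
    print(f"{bus}: {share:.2f} MB/s per drive, bottleneck is the {verdict}")
```

With two drives on a 66 MHz slot each drive still gets more than the gigabit network can carry, so the card does not hold the NAS back. On a 33 MHz slot, two busy drives would fall below the network's ceiling.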
Custom NAS Device
I've been toying with the idea of making a Network Attached Storage (NAS) device for home use from scratch. For all intents and purposes, I already have. I've got an old machine that I built several years ago assigned to the task, as well as serving as the network router, firewall and DNS cache. It has also recently come to my attention (via a newly acquired Kill-A-Watt power meter) that this machine sucks in roughly 120 Watts of electricity by itself. At roughly 10 cents a Kilowatt-hour, this comes to over $100 a year just to leave the machine on. Here I thought I was doing well by consolidating several devices (USB hard drives, router/firewall) into a single machine, but the power consumption of this one machine significantly outweighs the total of the smaller devices.
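The running cost works out as follows, assuming the machine draws a steady 120 watts around the clock for a full year at a flat 10 cents per kilowatt-hour. The 30 watt comparison machine is purely hypothetical, included to show how the cost scales with power draw.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cost(watts: float, dollars_per_kwh: float) -> float:
    """Yearly electricity cost of a device that runs continuously."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * dollars_per_kwh


current_machine = annual_cost(120, 0.10)   # figures from the post
low_power_machine = annual_cost(30, 0.10)  # hypothetical 30 W replacement

print(f"current machine: ${current_machine:.2f} per year")
print(f"30 W machine:    ${low_power_machine:.2f} per year")
```

At 120 watts that is about 1051 kWh, or roughly $105 a year.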
This got me thinking about what features I would want in my ideal NAS device. At a very high level, I want a device that stores and protects large volumes of data and does not impose itself on my daily life.
Reading back over the above high level requirement, I'm surprised how simple it sounds, yet how broad and non-trivial it really is. To break this down a bit more, I will dissect this sentence into more specific desires.
A Device that Stores and Protects Large Volumes of Data
- My and my family's need for storage space is forever increasing. If this machine is going to satisfy my needs long term, its storage space will also need to grow.
- Data redundancy across two or more hard drives is an important step toward safeguarding against data loss. As such, RAID should be a key feature.
- "Dirty" or unreliable power can be damaging to any computer. A secondary power supply like a UPS or integrated battery similar to a laptop would help to remediate this risk.
- A case with sturdy construction, low center of mass and low profile would help to protect the device from minor bumps in high traffic areas.
Certainly, there are many other concerns when protecting data, but I am talking about a NAS device, not a disaster recovery plan.
Does Not Impose Itself on My Daily Life
- Minimal and easy maintenance. When something starts to go wrong, the system should notify me. Adding or swapping drives shouldn't require a tool chest and manual.
- File transfers to and from the device should be fast, even for large multimedia files.
- Sharing files with Windows and Unix operating systems should be seamless.
- Quiet and aesthetically pleasing. The device shouldn't draw any special attention when walking into a room due to bulky, ugly and loud construction. This is my primary reason for placing my current machine in the basement.
- Initial construction and running costs should be economical and environmentally green.
Devices similar to what I describe are already available on the market. For example, the Drobo "data robot" from Data Robotics, Inc. in combination with their DroboShare will do most of this for around $700 plus the cost of the hard drives. Still, I'm going to look into building one myself, even if only on paper. It would be fun to build, and I may even be able to one-up the Drobo by making my device more general purpose. For example, maybe I can make it a thin Linux machine that uses my 32 inch LCD TV as a monitor or a MythTV box using my HDHomeRun digital TV tuners.