Our data, wherever we are and whenever we need them

“Cloud computing,” like social networking (LinkedIn, Twitter, YouTube, et al.), is all the rage in this new era of distributed, collaborative, high-tech solutions to what are now everyday problems (some previously unknown until the time at which we arrived at a solution and forgot how we could have ever lived without them), and in the area of distributed, Internet-accessible file hosting, Dropbox reigns supreme.

Consider a fictional executive, Alice, who works frequently from home, and has a number of files that she needs to be able to access from both home and work. More importantly, she needs her files to stay synchronized between the two locations, so that if she updates an expense report at home in Upper Arlington she will have the up-to-date version when she arrives at work in Dublin, as well. Furthermore, while she is with her grandchildren in Manhattan, if she should realize that she forgot to record an expenditure, she needs to have access to that same up-to-date report.

In the past, Alice might have used a floppy disk, Zip drive, CD-RW, or, more recently, a portable flash drive. But these media are susceptible to damage, and can be lost or stolen with relative ease. Adding encryption would keep Alice’s files secure, but at the expense of convenience and easy interoperability. She might choose, instead, simply to email the files to herself, but this solution is cumbersome, and sacrifices the ability to keep versions of her files in sync easily wherever she goes. Should Alice further discover that one of her fellow executives, Bob, also needs access to some of her files, both removable media and emailing herself become next to impossible to implement as workable solutions.

These types of problems are what Dropbox has been specifically designed to address. When someone downloads and installs Dropbox, the software (which includes a free version with limited but not insubstantial storage capacity) creates a special directory on that person’s computer (called his or her dropbox) which is linked automatically to the Dropbox servers. Any files placed in this folder are automatically backed up to Dropbox’s servers and stored securely and privately, accessible only through that user’s Dropbox account. Whenever a file is updated, the Dropbox software updates it on the server as well, and since Dropbox detects what portion has been altered, the software does not transmit the entire file each time, but, rather, only the changes, making this process virtually instantaneous for most files.
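To make the idea of sending only the changes a little more concrete, here is a minimal sketch in Perl. It is not Dropbox's actual algorithm, which is not described here; the tiny fixed block size and the names block_digests and changed_blocks are assumptions made purely for illustration. The sketch fingerprints each block of a file and reports which blocks differ between two versions.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical block size, kept tiny so the example is easy to follow.
use constant BLOCK_SIZE => 4;

# Split a string into fixed-size blocks and fingerprint each one.
sub block_digests {
    my ($data) = @_;
    my @digests;
    for (my $pos = 0; $pos < length $data; $pos += BLOCK_SIZE) {
        push @digests, md5_hex(substr($data, $pos, BLOCK_SIZE));
    }
    return @digests;
}

# Return the indexes of the blocks in the new version that differ from
# the old version, including blocks that did not exist before.
sub changed_blocks {
    my ($old, $new) = @_;
    my @old = block_digests($old);
    my @new = block_digests($new);
    my @changed;
    for my $i (0 .. $#new) {
        push @changed, $i
            if $i > $#old || $old[$i] ne $new[$i];
    }
    return @changed;
}
```

Calling changed_blocks("aaaabbbbcccc", "aaaaBBBBcccc") reports only block 1, so only those four bytes would need to travel to the server. A fixed-block comparison like this one copes poorly with text inserted in the middle of a file, since every later block shifts; it is meant only to show the principle.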

Furthermore, the Dropbox software can be installed on multiple computers and linked to the same account, and whenever a file is updated (or added, or removed) in one of the linked locations, it is updated (respectively, added or removed) elsewhere as well. If a file is deleted by accident, Dropbox can bring the file back from the dead. If a change is made that is later found to be in error, the software allows the user seamlessly to roll the file back to a previous version. And, should the user use a computer that is not his or hers, Dropbox allows full access to his or her files through an intuitive, uncluttered web interface.

As if that weren’t enough, Alice’s desire to share some of her files with Bob is met by Dropbox as well. Our apocryphal user can place files in a “Public” folder, wherein he or she can create a special URL allowing anyone, whether or not he or she has a Dropbox account, to download and view the files. If more collaborative access is desired, one can create a “shared folder,” which allows any number of additional people with Dropbox accounts to have full access to all the files in the shared folder, all other features (version tracking, un-deleting, synchronization across computers) intact as well.

All of this is more easily experienced than explained. If you ever find yourself wanting to have access to certain files wherever you are, trying to keep track of something that needs to be maintained by multiple people in multiple locations, or even just wanting an easy way to back up some important files without having to worry about extra hard drives or stacks of disks, download Dropbox and give it a shot. You just might find that it solves a problem you didn’t even know you had.

Cloud Computing

Recently, there has been a push by companies like Microsoft, SalesForce, Amazon and Google to use their cloud computing services as a platform to build applications. What makes this any different from running a server within your office and storing your business data there? Nothing. You’re just outsourcing number-crunching power and storage.

If you use Gmail/Yahoo! Mail/Hotmail- guess what? You’re already using a cloud application. You have no knowledge of where data exists nor do (or should) you really care. You can reach that data from any computer where you have access to the Internet, and the entry cost to that data in terms of hardware is low.

What does a business gain by moving to a cloud computing model? Cost savings in hardware, data storage and processing power, and less reliance on internal servers for 100% availability. To take advantage of these savings, you become reliant on access to your chosen cloud service’s servers, which means relying on your ISP (as if any of us weren’t reliant on the web already); you have to rebuild your applications; and you no longer have your institutional data in-house.

Rebuilding applications to take full advantage of this technology is no small undertaking. Data that resides in your existing data store will have to be ported into your chosen cloud service and any application logic that speaks to your current data will have to be rewritten.

Why is this?

In terms of Microsoft’s platform, Azure, a developer now has to conform to a new standard of data storage rather than the ORM or ADO.NET model. Azure is made to deliver mass amounts of data and provide redundancy and recovery features- in order to meet this goal, you have to do things the Azure way. You, as a developer, have a layer of abstraction that sits on top of a network of database servers and no knowledge of how the data is stored in the most basic sense.

I am not going to push for using a SAAS model of development. I, personally, am no salesman. I am not sure I could convince a business owner that 10 years of work should be moved off-site. This is not to say that you must have all of your data off-site; you can pursue a hybrid model as well. I can, however, give one guideline that can ease the transition should it be something that your company wants to do.

Remove all of your business logic from your database.

This, in and of itself, can be a troubling task. I have worked on many applications that have had stored procedures that performed business logic- this has to change in order to use a cloud based platform. There is no more access to the database, so you have to write code that modifies data in the form of a service that runs in the cloud. Encapsulation is key for the cloud model to work.

As I am a Microsoft based developer, I have been focused on the platform that Microsoft has provided (which is said to have Java support soon). Some other examples of cloud service hosts:

Perl? I’ve seen those

So this morning I was asked to get content from a website out in plain text.

Visually, this means that the HTML code needs to be converted to straight text that can be viewed in Notepad without all of the tags.

As I have done some screen scraping in the past for other jobs, I am familiar with the concept of taking data from a terminal screen and working with it. I have not, however, done any sort of screen scraping for web content. 

My first step in the process is to ask what other people have used. I don’t want to re-invent the wheel if possible. As I am the only developer here, I find it useful to post questions on Twitter, as most of the people I follow have some attachment to the technical industry. I use it as an open messaging system- kind of like shouting down a hall and seeing who answers.

My first response was HTML::Strip. As someone who has been a Windows programmer for most of his career, focused on Microsoft based products (and not really web based platforms), this told me absolutely nothing. Google (or Bing… I’m trying…) tells me that this is essentially a Perl module. Huh?

So begins my morning’s quest for knowledge. I did a bit of digging, and I found that what I need to do first is get some type of Perl interpreter. These sorts of things come with Unix/Linux… but Microsoft pays for my life, so I use Windows. The top of my result list was fine for me, so I went with Strawberry Perl as my interpreter of choice.

As a value add, I get a CPAN Client, which is essentially a universal installer utility for installing modules that can be consumed by Perl scripts. Meaning that if you need to include a library that reads HTML pages and returns their content to you, you just tell CPAN the name of the library and it magically installs!

I need two things to get started:

  • HTML::Strip – the Perl library that strips the HTML markup out of a page, leaving the text
  • LWP::Simple – the Perl library that fetches pages from the web

After a bit more research I found that all I need to do is launch my CPAN Client and in the command prompt run install HTML::Strip, and then install LWP::Simple. It’s really just that simple! No messing around with installer files. It just works. Now I can write a script that consumes those libraries.
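One way to confirm that both installs worked is to ask Perl itself whether it can load the modules. This is only a sketch: the helper name module_available is made up for the example, while require and eval are standard Perl.

```perl
use strict;
use warnings;

# Return true if the named module can be loaded by this Perl installation.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # HTML::Strip -> HTML/Strip.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Report on the two modules this post relies on.
for my $module (qw(HTML::Strip LWP::Simple)) {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'installed' : 'missing';
}
```

If either line says missing, going back to the CPAN Client and running the install command for that module again is the first thing to try.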

This post is getting long and rather than just drag on with coding, here’s how we can scrape the text from a web page using Perl:

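#!/usr/bin/perl
use strict;
use warnings;

use HTML::Strip;
use LWP::Simple;

my $hs  = HTML::Strip->new();
my $url = "http://www.google.com";

# get() returns undef if the page could not be fetched
my $content = get($url);
die "Could not fetch $url\n" unless defined $content;

# Strip the tags and print what is left
my $clean_text = $hs->parse($content);
print $clean_text;
$hs->eof;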

Done. That will write the contents of our $url variable’s web site to the screen. I save my script into a text file C:\strawberry\perl\Scripts\Scraper.pl (I use .pl as the file extension only for convention’s sake).

To execute my script, I open the command prompt and type perl C:\strawberry\perl\Scripts\Scraper.pl and the result is printed to the command window.

 

 

Obviously, I still have some work to do to make my little script a viable solution:

  • Scrape a specific target area on the web page rather than the whole page
  • Loop thru a list of pages to parse rather than just a single page

…but that’s the bulk of what I needed it to do. For all you Perl experts out there- I probably butchered your favorite language... sorry.
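The second item on that list could be sketched roughly as follows. The function name scrape_pages and the idea of passing the fetch routine in as an argument are assumptions made for illustration, not part of either module; only new, parse and eof (HTML::Strip) and get (LWP::Simple) come from the libraries themselves. It does not attempt the first item, scraping a specific area of a page.

```perl
use strict;
use warnings;
use HTML::Strip;

# Strip the markup from each page and return a hash of URL => plain text.
# $fetch is a code reference that takes a URL and returns its HTML
# (or undef on failure), for example \&LWP::Simple::get.
sub scrape_pages {
    my ($fetch, @urls) = @_;
    my $hs = HTML::Strip->new();
    my %text_for;
    for my $url (@urls) {
        my $html = $fetch->($url);
        unless (defined $html) {
            warn "Could not fetch $url\n";
            next;    # skip pages that failed and carry on with the rest
        }
        $text_for{$url} = $hs->parse($html);
        $hs->eof;    # reset the parser before the next document
    }
    return %text_for;
}
```

With use LWP::Simple; at the top of the script, calling scrape_pages(\&LWP::Simple::get, "http://www.google.com") returns the stripped text keyed by URL. Passing the fetch routine in also makes it easy to try the function against canned HTML without touching the network.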

 

Security on the chopping block

Everyone is aware that budgets everywhere are shrinking. One area that should always remain a priority is information security. Unfortunately, with the downturn in the market, more and more IT departments are working with ever-dwindling budgets, which often force those implementing technology to make hard choices. As a result, the practicalities of keeping the infrastructure running have taken priority over keeping the network safe.

Data intrusion is a constant threat in our modern world. Don’t think that anyone is trying to access your data? You would be terribly mistaken. People exist who try to gain any access to any data that they can get. Your data is at risk. This has been proven time and time again.

Companies and individuals need to take a long hard look at the cuts they are looking to make. Regular reviews of your infrastructure need to be undertaken. If you are in charge of IT, finances or just run your own business, you need to be aware of what measures are being taken to protect your data – and that those measures are adequate.

Tight budgets may in fact be here for the foreseeable future, but you don’t want to put off the security changes that your network needs because you cannot afford them. Instead of giving up because of your budget – start looking at the alternatives. Just because you can’t afford the package that everyone else is using doesn’t mean that there isn’t something just as good, or at least far better than what you have, at a much lower price – or possibly even free.

Start looking at Open Source alternatives – open source products are often free and comparable to commercial products. Plenty of options exist. For example, need a VPN to connect to your office securely when remote? Try OpenVPN. Need a replacement for your aging firewall that doesn’t support newer protocols or provide the security that you require? Try SmoothWall. Need to replace your anti-virus with a lower cost solution? Check out Clam (free) or F-Prot ($50/yr for 10 Computers).

The moral here: you don’t need to forgo the protection that you need – simply because your budget has become too tight. If you spend some time to look for a solution you might just find that the solution has been there for a while and at a much more reasonable cost than you had thought. You need to protect your network, your computers and your data. Don’t make the mistake that so many others have made – don’t put your security on the chopping block.

Content Management

Who needs revision tracking? I do, and I love it. I want to be able to see the changes made to a document or spreadsheet and the comments added along with a date. As a programmer I have used some form of source control for ten years and, without knowing it, I have come to rely on it to keep track of changes. Because of it, I have been able to roll a piece of code back to a version before I broke it.

There are many terms for keeping track of versioning within a document. Over the years, our terms have changed and our ability to track changes has grown. DMS’s (Document Management Systems) became CMS’s (Content Management Systems) which then became ECMS’s (Enterprise Content Management Systems). Why just let a document have all the fun? What about spreadsheets, images and executables?

There are hundreds of solutions that allow you to track versioning in your documents, and all of them are better than searching through years of e-mails looking for the one from the colleague who sent the version of the document that you want.

Right now I’m writing this article in Google Docs. If you have not used this solution to simplify your organization’s revision tracking, I suggest you take a look at it. I have found this to be the best solution for my personal documents because of the zero software footprint on my computers.

I can see the changes that were made between two different versions of this article. Should I need to compare the differences, Google Docs allows me to show that information, as well as to tag the changes with a comment. Most importantly, this tool scales well from one user to many.

To try and apply CMS concepts to the real world, think of this in terms of a sales proposal: a team of people working on a single document. We would have a technical group to gather requirements for the project, a sales group adding (and revising) the cost of products and services, and a documentation group adding and tailoring verbiage to the specific client.

Over all of this activity, the account manager would be constantly reviewing the document. In our example, and more often than not in practice, our account manager works externally, allowing very little physical contact with the team of people working on the proposal during the sales cycle.

In a world without Content Management, the account manager gets separate e-mails from the technical staff, documentation team, and internal sales teams, and each e-mail requires changes that will impact the other teams. However, each group is busy on many other internal projects and finding time to get the team together is difficult.

Now frustrated, the account manager edits each document from his hotel and replies to each team. Unwittingly, the account manager has just added more places to search for a document: having added revisions and sent them by e-mail, he must now search his ‘Sent Items’ each time he looks for a copy of the document. Not to mention that no group has access to the others’ changes until they are compiled into the draft version on the internal network.

Enter the concept of content management. Using some sort of CMS, the team works with a single document that can be modified with revision tracking. Our account manager can now see the changes by each user on the team. Because everyone is now using the same document, each team member’s changes can be seen by all others.

Collaboration is now inherent to the system. The account manager can now make pricing changes owing to some lunchtime feedback from their prospect and the technical staff can adjust some of their hardware requirements. Rather than using a strikethrough font to tell a team member to remove a sentence, the account manager can make the changes, and allow the CMS to show the differences in the versions.

From the very high level, a content management system is a package of services that allow users to store and track changes to a piece of information. That piece of information could be a spreadsheet, a web page, or a document.
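As a rough illustration of what "store and track changes" means at its simplest, here is a sketch of an in-memory revision history for a single document. It is not how any particular CMS works; every name in it (new_document, save_revision, content_at, roll_back) is hypothetical, and a real system would also handle storage, permissions and comparing versions.

```perl
use strict;
use warnings;

# A hypothetical in-memory revision history for one document.
sub new_document { return +{ revisions => [] }; }

# Store a new revision with its author, comment and timestamp;
# returns the new revision number (numbering starts at 1).
sub save_revision {
    my ($doc, $author, $comment, $content) = @_;
    push @{ $doc->{revisions} }, {
        author  => $author,
        comment => $comment,
        saved   => time,
        content => $content,
    };
    return scalar @{ $doc->{revisions} };
}

# Fetch the content of a given revision (defaults to the latest).
sub content_at {
    my ($doc, $number) = @_;
    my $revs = $doc->{revisions};
    $number = @$revs unless defined $number;
    return if $number < 1 || $number > @$revs;
    return $revs->[ $number - 1 ]{content};
}

# Roll back by saving an old revision's content as a new revision,
# so the history itself is never lost.
sub roll_back {
    my ($doc, $author, $number) = @_;
    my $old = content_at($doc, $number);
    return unless defined $old;
    return save_revision($doc, $author, "Rolled back to revision $number", $old);
}
```

In the sales proposal example, each group would call save_revision with its own changes and comment, and the account manager could read any earlier version with content_at or undo a bad edit with roll_back.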

Examples of ECMS:

To give credit where it is due: this post was written in response to, and perhaps to elaborate on, a post by Brian Caldwell.

Polleverywhere: a fun and effective resource for speakers

For anybody who has frequent speaking engagements, Polleverywhere could be a great tool. Polleverywhere is an easy and effective way to poll your audiences, à la the Ask the Audience feature on Who Wants to Be a Millionaire.

Speakers can instantly poll their audience by using a poll that has been embedded into their PowerPoint presentations, or by using the Polleverywhere website. On the flip side, presentation attendees can vote on the poll by texting their answer and a voting keyword to a pre-determined number or through Twitter by adding @poll to their tweet.

The best part about Polleverywhere is that the responses are displayed on-screen in real-time. This is a great way to move a presentation forward by anonymously gathering the thoughts and opinions of those in your audience.

Types of polls

Polleverywhere doesn’t just stop at multiple choice polls. In fact, the website allows for free text polling, which allows participants to answer more open-ended questions, such as, “Do you have any further questions for the presenter?”

Presenting at a fundraising event? Use a goal poll to show the audience’s instant contributions using a rising thermometer. Participants contribute a pledge just as if they were texting in a response to one of the polls mentioned above.

How much does it cost?

Polleverywhere has six plans to choose from for business and non-profit use, ranging from their free Basic Plan to the Platinum Plan for $1,400 per month. Depending on your class size, the Basic Plan boasts some great features, including 30 votes per poll, PowerPoint polls, web voting, widgets, downloadable results, Twitter and Web-phone participation, and more.

Polleverywhere also has several free and paid plan choices for teachers in K-12 and higher education.

This just scratches the surface of what Polleverywhere has to offer. Check it out today.

CAUTION: “Free” wireless available here

The traveling business user has become accustomed to taking free wireless Internet at airports, hotels, coffee shops, bookstores and a multitude of other locations for granted. Being able to stay connected anywhere has become more of a necessity than a convenience. But the reality is that this free convenience comes at a cost: Security.

Security, or lack thereof, is the major problem with ‘free’ wireless Internet access at all these locations. Here are the problems: First, free wireless Internet is rarely encrypted, leaving your data open to interception and possibly compromising sensitive data. Second, you have no way of knowing if the wireless access point you are connecting to is actually what it appears to be. Just because it says ‘Airport’ or ‘Coffee Shop’ doesn’t mean that it really is what it says it is – often these can be access points that are maliciously set up in order to steal as much information as possible from you.

Now, I don’t want you to get frightened away from free wireless access points just because of these dangers. Free access points can be very beneficial for certain uses. But, you should take some precautions when using them.

Foremost, make sure that your computer has a firewall – even the built-in Windows Firewall is better than none at all. This severely limits the routes that attackers can use to get into your system. You should also avoid transmitting usernames, passwords, credit card numbers or other sensitive information unless you have a secure channel for transmission of the data, like SSL (this stands for Secure Sockets Layer, which is a data encryption method – you can recognize SSL-based connections in your web browser by the HTTPS prefix on websites rather than HTTP, which is used for unencrypted connections), or VPN (or Virtual Private Network – a method of securely connecting to a ‘trusted’ network – like your home or office via a client/server setup).
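The HTTPS check is simple enough to express in a couple of lines of code. The sketch below, with the made-up name uses_https, only looks at the prefix of an address; it says nothing about whether the site's certificate is valid, so treat it as an illustration of the rule of thumb rather than a security tool.

```perl
use strict;
use warnings;

# Return true when the URL starts with the encrypted HTTPS scheme.
sub uses_https {
    my ($url) = @_;
    return $url =~ m{^https://}i ? 1 : 0;
}
```

A call such as uses_https("http://www.google.com") is false because the address starts with plain HTTP, so anything typed into a page at that address would travel unencrypted.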

Avoid using your e-mail client unless your system uses encrypted connections (check with your IT department or e-mail provider if you’re unsure). If your e-mail client doesn’t connect securely and you use it, you could end up giving a rogue user access to your account information.

There are alternatives to using free Wi-Fi: Aircards or Tethering* allow you to use the network provided by your cellular company. While not foolproof – this method does mitigate the risks of your packets (data) being easily sniffed out by someone at the location you are at. Also, you will find that many airports (at kiosks) and hotels have wired Data Ports available – these can be lower risk than unsecured wireless.

And if you do spend a lot of time in hotels you may want to check into a portable wireless router (like the D-Link DWL-G730AP Pocket Router/AP).  This way you could connect to a hotel’s wired network with your portable wireless router and then have an encrypted wireless access of your own available. And not only that – while many hotels have free wired Internet, often you have to pay for wireless but not if you have a portable wireless router. And, considering the price some hotels charge for Wi-Fi, the portable wireless router could pay for itself in two nights at a hotel and start making returns for you.

Wherever you are and whatever method you choose to connect, remember to be mindful that there can be risks, and that “free” Wi-Fi could end up costing far more than you bargained for.

*Tethering is a method of getting wireless access by connecting your phone, PDA or other wireless device to your computer and using its Internet connection. Tethering is usually available for a modest fee – and offers similar speeds to an aircard.

Welcome to TechieBytes Podcasts

A weekly discussion of technology topics and trends affecting small business and associations.

  • Introductions
  • Social Media In Associations

Bill Sheridan, Maryland Society of CPAs
Chris Jenkins, Ohio Society of CPAs
David Gammel, High Context Consulting, LLC
Paul Schneider, Socious, Inc.
Direct download: TechieBytes_031009.mp3
Category: podcasts — posted at: 7:22 PM

Move over TwitPic – there’s a new kid in town

TweetPhoto is a free photo sharing service for Twitter and Facebook. TweetPhoto lets you share photos on Twitter and Facebook, and allows interaction with any user or photo.

Like TwitPic, TweetPhoto lets users upload photos by e-mail, through the web, or on their mobile phones, but the similarities end there.

Some new features that TweetPhoto has to offer include the ability to see who has viewed your photos, the option to favorite or retweet any photo, and the ability to filter photos posted by your Twitter or Facebook friends. All photos posted are automatically geo-tagged which allows for later searching of photos, and the ability to see trending tags.

Check out TweetPhoto at www.tweetphoto.com.

Twazzup

Twazzup is an efficient way to search and follow Twitter trends. It’s exceptionally good for following events and conference tweet streams. It offers a fast live streaming interface with the ability to filter and sort content within the stream. If you’re looking to mine information from Twitter I strongly suggest you try Twazzup.
