I think the way it works now is efficient and much better than if they included voice acting. Voice acting works for other games because their characters are realistic, both visually and personality-wise. But in a franchise like Zelda, the world is filled with unique and ridiculous characters. So far, I believe the simple sounds they make every so often are enough for the moment. They convey the characters' moods, and using text boxes reduces the need for a transition between those moods. Many characters throughout the series have had major mood swings, and trying to make those changes sound logical through vocals is difficult, if not impossible.
Alongside this, sometimes players need to hear a character repeat what they said. Can you imagine how obnoxious it would become to hear the same sound bite over and over again? With the text box system, one can simply skip through the text until reaching the part that needs clarifying. And if voice acting doesn't prohibit the skipping of dialogue, it will certainly sound awkward when the player quickly skips through text boxes to finish the conversation, as the character would begin speaking each text box as it appears.
Adding voice acting only to cutscenes is another idea that would most likely fail. If the characters are only heard during those moments, the use of text boxes during the rest of the game would stand out even more and feel awkward.